As anyone who has examined an iOS device (or an OS X device for that matter) will know, property list files are a major source of potential evidence. As one of the main data storage formats, they might contain anything from configuration details to browsing history to chat logs.
We’ve examined property lists in detail previously, covering their file formats and the challenges they might present (you can download the white paper here). We have also released our tool PIP, which can be used to parse data from plists and other XML data in a structured way.
As regular readers might have gathered, I’m pretty keen on the Python scripting language. While the built-in libraries do support reading the XML property list format (through the plistlib module), there is no support for the binary property list format, which is increasingly becoming the standard format.
Recently I had a task where I needed to parse a large number of binary format property list files inside a script I was writing. It was theoretically possible to have the script export them all to a single location and then run PIP separately, but I really wanted to have everything self-contained. Besides, I needed to make use of the data extracted from the plist files later on in the script.
One of the beauties of Python is how quickly you can go from concept to “product”, so I decided that rather than waste time finding a work-around I would craft a proper solution. After a few hours’ work I had a fully functioning module for dealing with binary plist files in Python and today we’re excited to be releasing the code so that other practitioners and coders can make use of it.
I designed the module to provide the parsed data in as native a format as possible (see Table 1), so when writing your code you do not have to deviate from the normal Pythonic constructs: the data structure returned contains everyday Python objects. It is also largely interchangeable with the structure returned by plistlib’s “readPlist” function (the exception being that plistlib wraps “data” fields in a “Data” object rather than giving direct access to the underlying “bytes” object).
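To give a feel for what working with everyday Python objects looks like, here is a minimal sketch that walks a nested structure of dictionaries, lists and bytes and describes what it finds. The “describe” helper and the sample dictionary are my own illustrations (hypothetical, hand-built data, not output from a real plist), so treat the exact types shown as an assumption rather than a statement of what the module returns for any given file.

```python
def describe(value, indent=0):
    """Return a list of text lines describing a nested structure of plain Python objects."""
    pad = "  " * indent
    if isinstance(value, dict):
        lines = []
        for key, item in value.items():
            lines.append(f"{pad}{key}:")
            lines.extend(describe(item, indent + 1))
        return lines
    if isinstance(value, (list, tuple)):
        lines = []
        for index, item in enumerate(value):
            lines.append(f"{pad}[{index}]")
            lines.extend(describe(item, indent + 1))
        return lines
    if isinstance(value, (bytes, bytearray)):
        # "data" fields are reachable directly as bytes, so len() and slicing just work
        return [f"{pad}<{len(value)} bytes of data>"]
    return [f"{pad}{value!r} ({type(value).__name__})"]


# Hypothetical, hand-built sample standing in for a parsed property list
sample = {
    "Name": "example",
    "Count": 3,
    "Blob": b"\x00\x01\x02",
    "Items": ["a", "b"],
}

for line in describe(sample):
    print(line)
```

Because nothing here is specific to property lists, the same helper should work on any structure made of these basic types.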
The module has only one function that you need to know about in order to use it: “load()”. This function takes a single argument, which should be a binary file-like object*, and it returns the Python representation of the data in the property list.
In addition to the ccl_bplist module, the Google Code repository contains an example of the module in use: it parses the “IconState.plist” file from an iOS device to audit the apps and folders present on the Springboard home screen.
We really hope that the community will be able to make use of this module and if you have any questions please leave a comment or email us at email@example.com.
*When working from a file on disk, you should use the open() function with “b” in the mode string, e.g. open(file_name, "rb").
There may be other times when the bplist has come from another source, e.g. a BLOB field in a database or even a property list embedded in another property list’s data field. In these cases you can wrap a bytes object in a BytesIO object, found in the “io” module, e.g. io.BytesIO(some_data).
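The two cases above can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: “open_bplist_source” and “looks_like_bplist” are hypothetical names of my own, not part of ccl_bplist, and the header check relies on the fact that binary property lists begin with the bytes “bplist”.

```python
import io

BPLIST_MAGIC = b"bplist"  # binary plists start with these bytes, followed by a version


def open_bplist_source(source):
    """Return a binary file-like object for either raw bytes or a path on disk.

    Hypothetical helper: bytes (e.g. a database BLOB) are wrapped in BytesIO,
    anything else is treated as a file name and opened in binary mode.
    """
    if isinstance(source, (bytes, bytearray)):
        return io.BytesIO(bytes(source))
    return open(source, "rb")


def looks_like_bplist(file_obj):
    """Peek at the header, then restore the stream position for the real parser."""
    start = file_obj.tell()
    header = file_obj.read(len(BPLIST_MAGIC))
    file_obj.seek(start)
    return header == BPLIST_MAGIC


# Intended use with the module described in this post (left commented out so
# the sketch runs without it; some_data is a placeholder for your own bytes):
#
# with open_bplist_source(some_data) as f:
#     if looks_like_bplist(f):
#         parsed = ccl_bplist.load(f)
```

Checking the header first is optional, but it can be a cheap way to skip BLOBs that turn out to hold something other than a binary property list.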