Parsing Apple System Log (ASL) files on iOS and OSX for Fun and Evidence (and a Python script to do it for you)

(If you’re dying to get stuck in and are only after the links to the Python scripts, they can be found at the bottom of the post!)

After every update to iOS I like to take a file system dump of one of our test iDevices and have a poke around to see what’s changed and what’s new. Recently, on one of my excursions around the iOS file system, I came across something that looked promising that I hadn’t dug into before: a bunch of files with the “.asl” extension which were located on the data partition in “log/DiagnosticMessages”. There were lots of them too –each with a file name referring to a particular date – they went back months!

DiagnosticMessages file listing

Log Files!

“Loads of lovely log files!” I thought to myself as I excitedly dropped one of the files into my current text editor of choice (Notepad++ if you’re interested) only to be disappointed by what was clearly a binary file format.

ASLDB File in a text editor

Curses!

So I headed over to Google and entered some hopeful sounding search queries and came across a very useful blog post (http://crucialsecurityblog.harris.com/2011/06/22/the-apple-system-log-%E2%80%93-part-1/) which described the role of ASL files on OSX and listed some ways for accessing the logs from within OSX, but I was interested in gaining a better understanding of the file format (besides, the nearest Mac to me was on a different floor!).

A little more digging revealed that the code that governed the ASL logging, and the files it generated were part of the Open Source section of OSX, as a result I was able to view the code that was actually responsible for creating the file – my luck was looking up!

The two files I was particularly interested in were “asl.h” (most recent version at time of posting: http://opensource.apple.com/source/Libc/Libc-763.13/include/asl.h) and “asl_file.h” (most recent version at time of posting: http://opensource.apple.com/source/Libc/Libc-763.13/gen/asl_file.h). C header files are great; basically, their purpose is to define the data structures that are subsequently used in the functional code, so when it comes to understanding file formats, quite often they’ll tell you all you need to know without having to try and follow the flow of the actual program. Better yet, these files were pretty well commented. I know that not everyone reading this is going to want to read through a bunch of C code, so I’ll summarise the file format below (all numeric data is big endian):

First, the file Header:

Offset Length Data Type Description
0 12 String “ASL DB” followed by 6 bytes of 0x00
12 4 32bit Integer File version (current version is: 2)
16 8 64bit Integer File offset for the  first record in the file
24 8 64bit Integer Unix seconds timestamp, appears to be a file creation time
32 4 32bit Integer String cache size (not 100% sure what this refers to, may be maximum size for string entries in the records)
36 8 64bit Integer File offset for the last record in the file
44 36 Padding Should all be 0x00 bytes

So nothing too ominous there, although all of those pad-bytes at the end of the header suggest redundancy in the file spec in case apple ever fancy changing something. Indeed the fact that the header tells us that we’re on version 2 of the file format suggests that this has already happened.

The records in the file are arranged in a “doubly linked list”, that is, that every record in the file contains a reference (ie. the file offset of) the next and previous records.  From a high level, the records themselves are made up of a fixed length data section, followed by a variable length section which allows the storage of additional data in a key-value type structure, finally followed by the offset of the previous record. The table below explains the structure in detail.

NB: The string storage mechanism the records use is a little bit…interesting – I’ll explain in detail later in this post, but for now if you see a reference to an “ASL String”, I mean one of these “interesting” strings!

Offset Length Data Type Description
0 2 Padding 0x00 0x00
2 4 32bit Integer Length of this record (excluding this and the previous field)
6 8 64bit Integer File offset for next record
14 8 64bit Integer Numeric ID for this record
22 8 64bit Integer Record timestamp (as a Unix seconds timestamp)
30 4 32bit Integer Additional nanoseconds for timestamp
34 2 16bit Integer Level (see below)
36 2 16bit Integer Flags
38 4 32bit Integer Process ID that sent the log message
42 4 32bit Integer UID that sent the log message
46 4 32bit Integer GID that sent the log message
50 4 32bit Integer User read access
54 4 32bit Integer Group read access
58 4 32bit Integer Reference PID (for processes under the control of launchd)
62 4 32bit Integer Key-Value count: The total number of keys and values in the key-value storage of the record
66 8 ASL String Host that the sender belongs to (usually the name of the device)
74 8 ASL String Name of the sender (process) which send the log message
82 8 ASL String The sender’s facility
90 8 ASL String Log Message
98 8 ASL String The name of the reference process (for processes under control of launchd)
106 8 ASL String The session of the sender (set by launchd)
114 8 * Key-Value count ASL String[Key-Value count] The key-value storage: A key followed by a value, followed by a key followed by a value… and so on. All keys and values are strings

The level field mentioned above will have a numerical value which refers to the levels shown below:

Level Meaning
0 Emergency
1 Alert
2 Critical
3 Error
4 Warning
5 Notice
6 Info
7 Debug

As mentioned, the “ASL String” data type is a little odd. The ASL fields above take up 8 bytes, if the most significant bit in the 8 bytes is set (ie is 1), the rest of the most significant byte gives the length of the string, which occupies the remaining 7 bytes (unused bytes are set to 0x00). Conversely, if the top bit in the ASL String data type is not set (ie. Is 0) the entire 8 bytes should be interpreted as a 64bit Integer which gives the file offset where the string can be found. The string will be stored thusly:

Offset Length Data Type Meaning
0 2 Padding Padding bytes 0x00 0x01
2 4 32bit Integer String length
6 String length UTF8 String (nul-terminated) The string data

In order to get a better grip of what can be held in these files I decided to create a Python module to read these files and used it to dump out the contents of the ASL files I found on the iPhone.

Running the script

Running the script

Output from the script (iOS)

A snippet of the output produced by processing an iPhone’s ‘DiagnosticMessages’ folder

The first thing that struck me after running the script was the volume of messages: 16161 log messages spanning 10 months – and this was on a test handset which had lay idle for weeks at a time. The second thing was the prevalence of messages sent by the “powerd” daemon, over 87% of the messages had been sent by this process. The vast majority of these messages related to the device waking and sleeping – not through user interaction, but while the device was idle. Most of these “Wake” events occurred 2-5 minute apart, presumably to allow brief data connectivity to receive updates and push messages from apps.

Output from the script (iOS powerd messages)

Some powerd Wake and Sleep messages

The key thing that interested me about these messages was that they also noted the current battery-charge percentage in their text: this is the sort of data that just begs to be graphed, so I knocked up a little script which utilised the parsing module I had just written to extract just this data and present it in a graph-friendly manner.

Graph Friendly powerd Data

Graph Friendly Data

After graphing it (you want to use a scatter graph in Excel for this, not line as I discovered after some shouting at my screen) you are left with a graph which gives you some insight into the device’s use.

iOS Battery Use Graph

Some iPhone Power Usage (click for full-size)

The graph above shows around 3 weeks of battery usage data from the test handset. As noted previously, this test device would lay idle for days at a time (as suggested by the gentle downward gradients) but there were periods when the handset was in use, as shown by the steeper downward gradients on the 26th and 27th of April, which mostly took place within office hours. You can also clear see the points where the device was plugged in to be charged, suggested by the very steep upward gradients. As noted the power messages occur around ever 2-5 minutes, so the resolution is actually fairly good. The exception to this is while the device is plugged in as it no longer needs to sleep to preserve battery charge; typically I only saw an event when charging began and another when the device was unplugged and the battery began to discharge again.

There are a few other messages in the iOS ASL log that look interesting, but at this time I don’t have enough nice control data to make much of them. One thing that did hearten me somewhat was the fact that on the few extractions I’ve had the opportunity to take a look at from later revisions of iOS 5, there did seem to be some extra processes that were logging messages, so it’s my hope that we’ll see more and more useful data make its way into the ASL logs on iOS.

In addition to looking at iOS ASL files, I thought I’d take a look at some from an OSX installation. Pulling the logs from the “var/log/asl” on Lion (10.7.3) and running the parsing script across the whole directory brought back a far more varied selection of messages.

Output from the script (OSX)

Variety is the spice of life.

The number of records returned was actually far less than on iOS, partially due to the iOS “powerd” being so chatty, but more crucially because OSX tidies up its logs on a weekly basis. That’s not to say that you will only recover a week’s worth of logs though – on this test machine I recovered logs spanning 7 months. Rather, OSX has short-term log files (those with file names which begin with a timestamp) which have a shelf-life of a week and long term log files (those with file names which begin with “bb” followed by a timestamp). The “bb” in the long term log’s file name presumably stands for “best before” and the date, which is always in the future, is the date that the file should be cleared out. The short term log files tend to hold more “intimate” entries, often debug messages sent from 3rd party applications; the long term logs err more on the side of system messages. One particularly useful set of messages in the long term log are records pertaining to booting, shutting down, logins and logouts (hibernating, waking and failed logins are recorded too, but they end up in the short-term logs).

(As an aside:  one of my favourite things that I discovered when looking through these logs was the action of waking a laptop running OSX by wiggling a finger on the trackpad is recorded in the logs as a “HID Tickle”. Lovely.)

Like I did with the iOS power profiling, I put together a script which extracted these login and power records and timelines them.

OSX Login and power timeline

Login and power timeline

A couple of things worth noting beyond the basic boot/shutdown/login records: firstly when the device wakes it records why it happened – this can be quite specific: a USB device, the lid of a laptop being opened, the power button being pressed, etc. Secondly, you can see terminal windows (tty) being opened and closed as opening a terminal window involves logging in to a terminal session (OSX does this transparently, but it’s still logged).

We’ve released the scripts mentioned in this post to the community and they can be downloaded from https://code.google.com/p/ccl-asl/. The “ccl_asl” script is both a command line utility for dumping the contents of ASL files as well as a fully featured class module which you can use to write scripts along the lines of the battery profiler and login timeline scripts.

ASL files are, on the one hand, fairly dry system logs, but on the other, with a little work you can harvest some really insightful behavioural intelligence. As always if you have any questions, comments or suggestions you can contact us on research@ccl-forensics.com or leave a comment below.

Alex Caithness

Advertisements

5 thoughts on “Parsing Apple System Log (ASL) files on iOS and OSX for Fun and Evidence (and a Python script to do it for you)

  1. Hi, I don’t know if you’ve already seen my post in the Education forum, but this is almost exactly what I was considering covering for my final year project (I want to look into the Mac OS X logs, but on preview the similarities are obvious). There’s just one problem – my tutor doesn’t like the scarcity of published journal articles. Are you aware of any peer-reviewed articles relating to this topic? I’ve searched but haven’t been able to find any.

    • Joash,

      I searched a fair bit before I went on with writing this post – I’ve linked on this to only other resource that was easy to find – not a peer reviewed paper though.

      That said, there are papers out there which are more generally about log analysis (google: “forensic log analysis” and there are some on the first few pages) – perhaps those would would satisfy your tutor?

      There are certainly a number of other interesting artefacts in the OSX ASL logs that I came across, but didn’t have time to dig into in detail. The short lifespan of the (some of the) logs on OSX is a bit annoying – would be interested in the viability of recovering deleted logs – I started considering a way to carve them reliably based upon the information in the header and the last record, but haven’t written any scripts or anything yet; that might be something else to consider in your paper (if you’re allowed to do it!)

      Hope that helps,
      Alex

      • Yes, I found that blog you linked to as well, plus a few other scraps. Clearly in academic terms there’s a gap in the market here so I will try to persuade him I should fill it. Thanks for your advice!

  2. Pingback: Chrome Session and Tabs Files (and the puzzle of the pickle) | Digital Investigation

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s