New Epilog Signature files released

Epilog Signature files allow users to add specific support for new databases they encounter and although they are designed so that Epilog’s users can create their own signatures when the need arises, CCL-Forensics are committed to updating and releasing a sets of signatures, pre-written and ready to use.

In this new release we have had a real focus on smartphones adding support for:
• iOS6
• Android 4.0 (Ice Cream Sandwich)
• Android 4.1 (Jelly Bean)
• Android 3rd Party Applications
• iOS 3rd Party Applications
• Skype

We always welcome suggestions for signatures that you’d like to see added to the signature collection so please get in touch on epilog@ccl-forensics.com

For more information on epilog please visit our website – www.cclgroupltd.com/Buy-Software/

Parsing Apple System Log (ASL) files on iOS and OSX for Fun and Evidence (and a Python script to do it for you)

(If you’re dying to get stuck in and are only after the links to the Python scripts, they can be found at the bottom of the post!)

After every update to iOS I like to take a file system dump of one of our test iDevices and have a poke around to see what’s changed and what’s new. Recently, on one of my excursions around the iOS file system, I came across something that looked promising that I hadn’t dug into before: a bunch of files with the “.asl” extension which were located on the data partition in “log/DiagnosticMessages”. There were lots of them too –each with a file name referring to a particular date – they went back months!

DiagnosticMessages file listing

Log Files!

“Loads of lovely log files!” I thought to myself as I excitedly dropped one of the files into my current text editor of choice (Notepad++ if you’re interested) only to be disappointed by what was clearly a binary file format.

ASLDB File in a text editor

Curses!

So I headed over to Google and entered some hopeful sounding search queries and came across a very useful blog post (http://crucialsecurityblog.harris.com/2011/06/22/the-apple-system-log-%E2%80%93-part-1/) which described the role of ASL files on OSX and listed some ways for accessing the logs from within OSX, but I was interested in gaining a better understanding of the file format (besides, the nearest Mac to me was on a different floor!).

A little more digging revealed that the code that governed the ASL logging, and the files it generated were part of the Open Source section of OSX, as a result I was able to view the code that was actually responsible for creating the file – my luck was looking up!

The two files I was particularly interested in were “asl.h” (most recent version at time of posting: http://opensource.apple.com/source/Libc/Libc-763.13/include/asl.h) and “asl_file.h” (most recent version at time of posting: http://opensource.apple.com/source/Libc/Libc-763.13/gen/asl_file.h). C header files are great; basically, their purpose is to define the data structures that are subsequently used in the functional code, so when it comes to understanding file formats, quite often they’ll tell you all you need to know without having to try and follow the flow of the actual program. Better yet, these files were pretty well commented. I know that not everyone reading this is going to want to read through a bunch of C code, so I’ll summarise the file format below (all numeric data is big endian):

First, the file Header:

Offset Length Data Type Description
0 12 String “ASL DB” followed by 6 bytes of 0x00
12 4 32bit Integer File version (current version is: 2)
16 8 64bit Integer File offset for the  first record in the file
24 8 64bit Integer Unix seconds timestamp, appears to be a file creation time
32 4 32bit Integer String cache size (not 100% sure what this refers to, may be maximum size for string entries in the records)
36 8 64bit Integer File offset for the last record in the file
44 36 Padding Should all be 0x00 bytes

So nothing too ominous there, although all of those pad-bytes at the end of the header suggest redundancy in the file spec in case apple ever fancy changing something. Indeed the fact that the header tells us that we’re on version 2 of the file format suggests that this has already happened.

The records in the file are arranged in a “doubly linked list”, that is, that every record in the file contains a reference (ie. the file offset of) the next and previous records.  From a high level, the records themselves are made up of a fixed length data section, followed by a variable length section which allows the storage of additional data in a key-value type structure, finally followed by the offset of the previous record. The table below explains the structure in detail.

NB: The string storage mechanism the records use is a little bit…interesting – I’ll explain in detail later in this post, but for now if you see a reference to an “ASL String”, I mean one of these “interesting” strings!

Offset Length Data Type Description
0 2 Padding 0x00 0x00
2 4 32bit Integer Length of this record (excluding this and the previous field)
6 8 64bit Integer File offset for next record
14 8 64bit Integer Numeric ID for this record
22 8 64bit Integer Record timestamp (as a Unix seconds timestamp)
30 4 32bit Integer Additional nanoseconds for timestamp
34 2 16bit Integer Level (see below)
36 2 16bit Integer Flags
38 4 32bit Integer Process ID that sent the log message
42 4 32bit Integer UID that sent the log message
46 4 32bit Integer GID that sent the log message
50 4 32bit Integer User read access
54 4 32bit Integer Group read access
58 4 32bit Integer Reference PID (for processes under the control of launchd)
62 4 32bit Integer Key-Value count: The total number of keys and values in the key-value storage of the record
66 8 ASL String Host that the sender belongs to (usually the name of the device)
74 8 ASL String Name of the sender (process) which send the log message
82 8 ASL String The sender’s facility
90 8 ASL String Log Message
98 8 ASL String The name of the reference process (for processes under control of launchd)
106 8 ASL String The session of the sender (set by launchd)
114 8 * Key-Value count ASL String[Key-Value count] The key-value storage: A key followed by a value, followed by a key followed by a value… and so on. All keys and values are strings

The level field mentioned above will have a numerical value which refers to the levels shown below:

Level Meaning
0 Emergency
1 Alert
2 Critical
3 Error
4 Warning
5 Notice
6 Info
7 Debug

As mentioned, the “ASL String” data type is a little odd. The ASL fields above take up 8 bytes, if the most significant bit in the 8 bytes is set (ie is 1), the rest of the most significant byte gives the length of the string, which occupies the remaining 7 bytes (unused bytes are set to 0x00). Conversely, if the top bit in the ASL String data type is not set (ie. Is 0) the entire 8 bytes should be interpreted as a 64bit Integer which gives the file offset where the string can be found. The string will be stored thusly:

Offset Length Data Type Meaning
0 2 Padding Padding bytes 0x00 0x01
2 4 32bit Integer String length
6 String length UTF8 String (nul-terminated) The string data

In order to get a better grip of what can be held in these files I decided to create a Python module to read these files and used it to dump out the contents of the ASL files I found on the iPhone.

Running the script

Running the script

Output from the script (iOS)

A snippet of the output produced by processing an iPhone’s ‘DiagnosticMessages’ folder

The first thing that struck me after running the script was the volume of messages: 16161 log messages spanning 10 months – and this was on a test handset which had lay idle for weeks at a time. The second thing was the prevalence of messages sent by the “powerd” daemon, over 87% of the messages had been sent by this process. The vast majority of these messages related to the device waking and sleeping – not through user interaction, but while the device was idle. Most of these “Wake” events occurred 2-5 minute apart, presumably to allow brief data connectivity to receive updates and push messages from apps.

Output from the script (iOS powerd messages)

Some powerd Wake and Sleep messages

The key thing that interested me about these messages was that they also noted the current battery-charge percentage in their text: this is the sort of data that just begs to be graphed, so I knocked up a little script which utilised the parsing module I had just written to extract just this data and present it in a graph-friendly manner.

Graph Friendly powerd Data

Graph Friendly Data

After graphing it (you want to use a scatter graph in Excel for this, not line as I discovered after some shouting at my screen) you are left with a graph which gives you some insight into the device’s use.

iOS Battery Use Graph

Some iPhone Power Usage (click for full-size)

The graph above shows around 3 weeks of battery usage data from the test handset. As noted previously, this test device would lay idle for days at a time (as suggested by the gentle downward gradients) but there were periods when the handset was in use, as shown by the steeper downward gradients on the 26th and 27th of April, which mostly took place within office hours. You can also clear see the points where the device was plugged in to be charged, suggested by the very steep upward gradients. As noted the power messages occur around ever 2-5 minutes, so the resolution is actually fairly good. The exception to this is while the device is plugged in as it no longer needs to sleep to preserve battery charge; typically I only saw an event when charging began and another when the device was unplugged and the battery began to discharge again.

There are a few other messages in the iOS ASL log that look interesting, but at this time I don’t have enough nice control data to make much of them. One thing that did hearten me somewhat was the fact that on the few extractions I’ve had the opportunity to take a look at from later revisions of iOS 5, there did seem to be some extra processes that were logging messages, so it’s my hope that we’ll see more and more useful data make its way into the ASL logs on iOS.

In addition to looking at iOS ASL files, I thought I’d take a look at some from an OSX installation. Pulling the logs from the “var/log/asl” on Lion (10.7.3) and running the parsing script across the whole directory brought back a far more varied selection of messages.

Output from the script (OSX)

Variety is the spice of life.

The number of records returned was actually far less than on iOS, partially due to the iOS “powerd” being so chatty, but more crucially because OSX tidies up its logs on a weekly basis. That’s not to say that you will only recover a week’s worth of logs though – on this test machine I recovered logs spanning 7 months. Rather, OSX has short-term log files (those with file names which begin with a timestamp) which have a shelf-life of a week and long term log files (those with file names which begin with “bb” followed by a timestamp). The “bb” in the long term log’s file name presumably stands for “best before” and the date, which is always in the future, is the date that the file should be cleared out. The short term log files tend to hold more “intimate” entries, often debug messages sent from 3rd party applications; the long term logs err more on the side of system messages. One particularly useful set of messages in the long term log are records pertaining to booting, shutting down, logins and logouts (hibernating, waking and failed logins are recorded too, but they end up in the short-term logs).

(As an aside:  one of my favourite things that I discovered when looking through these logs was the action of waking a laptop running OSX by wiggling a finger on the trackpad is recorded in the logs as a “HID Tickle”. Lovely.)

Like I did with the iOS power profiling, I put together a script which extracted these login and power records and timelines them.

OSX Login and power timeline

Login and power timeline

A couple of things worth noting beyond the basic boot/shutdown/login records: firstly when the device wakes it records why it happened – this can be quite specific: a USB device, the lid of a laptop being opened, the power button being pressed, etc. Secondly, you can see terminal windows (tty) being opened and closed as opening a terminal window involves logging in to a terminal session (OSX does this transparently, but it’s still logged).

We’ve released the scripts mentioned in this post to the community and they can be downloaded from https://code.google.com/p/ccl-asl/. The “ccl_asl” script is both a command line utility for dumping the contents of ASL files as well as a fully featured class module which you can use to write scripts along the lines of the battery profiler and login timeline scripts.

ASL files are, on the one hand, fairly dry system logs, but on the other, with a little work you can harvest some really insightful behavioural intelligence. As always if you have any questions, comments or suggestions you can contact us on research@ccl-forensics.com or leave a comment below.

Alex Caithness

The Forensic Implications of SQLite’s Write Ahead Log

By Alex Caithness, CCL-Forensics

SQLite is a popular free file-based database format which is used extensively both on desktop and mobile operating systems (it is one of the standard storage formats available on both Android and iOS). This article sets out to examine the forensic implications, both pitfalls and opportunities, of a relatively new feature of the database engine: Write Ahead Log.

Before we begin, it is worth taking a moment to describe the SQLite file format. Briefly, records in the database are stored in file which in SQLite parlance is called the ‘Database Image’. The database image is broken up into “pages” of a fixed size (the size is specified in the file header). Each page may have one of a number of roles, such as informing the structure of the database, and crucially holding the record data itself. The pages are numbered internally by SQLite starting from 1.

Historically SQLite used a mechanism called “Rollback Journals” for dealing with errors occurring during use of the database. Whenever any data on a page of the database was to be altered, the entire page was backed up in a separate journal file. At the conclusion of a successful transaction the journal file would be removed; conversely if the transaction was interrupted for any reason (crash, power cut, etc.) the journal remained. This means that if SQLite accesses a database and finds that a journal is still present something must have gone wrong and the engine will restore the database to its previous state using the copies of pages in the journal, avoiding corrupted data.

From version 3.7.0 of the SQLite engine an alternative journal mechanism was introduced called “Write Ahead Log” (ubiquitously shortened to “WAL”). WAL effectively turned the journal mechanism on its head: rather than backing up the original pages then making changes directly to the database file, the database file itself is untouched and the new or altered pages are written to a separate file (the Write Ahead Log). These altered or new pages will remain in the WAL file, the database engine reading data from the WAL in place of the historic version in the main database. This continues until a “Checkpoint” event takes place, finally copying the pages in the WAL file into the main database file. A Checkpoint may take place automatically when the WAL file reaches a certain size (by default this is 1000 pages) or performed manually by issuing an SQL command (“PRAGMA wal_checkpoint;”) or programmatically if an application has access to the SQLite engine’s internal API.

Initial state: No pages in the WAL

Initial state: No pages in the WAL

Page Altered - new version written to WAL

Page 3 is altered. The new version of the page is written to the WAL and the database engine uses this new version rather than the old version in the database file itself.

Checkpoint

A checkpoint operation takes place and the new version of the page is written into the database file.

It is possible to detect whether a database is in WAL mode in a number of ways: firstly this information is found in the database file’s header; examining the file in a hex editor, the bytes at file offset 18 and 19 will both be 0x01 if the database is using the legacy rollback journal or 0x02 if the database is in WAL mode. Secondly you can issue the SQL command “PRAGMA journal_mode;” which will return the value “wal” if the database is in WAL mode (anything else indicates rollback journal). However, probably the most obvious indication of a database in WAL mode is the presence of two files named as “<databasefilename>-wal” and “<databasefilename>-shm” in the same logical directory as the database (eg. if the database was called “sms.db” the two additional files would be “sms.db-wal” and “sms.db-shm”).

The “-wal” file is the actual Write Ahead Log which contains the new and updated database pages, its structure is actually fairly simplistic. The “-wal” file is made up of a 32 byte file header followed by zero or more “WAL frames”. The file header contains the following data:

Offset Size Description
0 4 bytes File signature (0x377F0682 or 0x377F0683)
4 4 bytes File format version (currently 0x002DE218 which interpreted as a big endian integer is 3007000)
8 4 bytes Associated database’s page size (32-bit big endian integer)
12 4 bytes Checkpoint sequence number (32-bit big endian integer which is incremented with every checkpoint, starting at 0)
16 4 bytes Salt-1 Random number, incremented with every checkpoint *
20 4 bytes Salt-2 Random number, regenerated with every checkpoint
24 4 bytes Checksum part 1 (for the first 24 bytes of the file)
28 4 bytes Checksum part 2 (for the first 24 bytes of the file)

* In testing it was found that although the official (and at the time of writing, up to date) command line version of SQLite v3.7.11 behaved correctly, when using SQLite Expert v3.2.2.2.2102 this value appeared to be regenerated after each checkpoint (which is assumed by the author to be incorrect behaviour)

The WAL Frames that follow the header consist of a 24 byte header followed by the number of bytes specified in the file header’s “page size” field which is the new or altered database page. The Frame Header takes the following form:

Offset Size Description
0 4 bytes Database page number (32-bit big endian integer)
4 4 bytes For a record that marks the end of a transaction (a commit record) this will be a 32-bit big endian integer giving the size of the database file in pages, otherwise 0.
8 4 bytes Salt-1, as found in the WAL header at the time that this Frame was written
12 4 bytes Salt-2, as found in the WAL header at the time that this Frame was written
16 4 bytes Checksum part 1 – cumulative checksum up through and including this page
20 4 bytes Checksum part 2 – cumulative checksum up through and including this page

There are a number of potential uses and abuses for the WAL file in the context of digital forensics, but first, the behaviour of SQLite while in WAL mode should examined. A number of operations were performed on a SQLite database in WAL mode. After each operation the database file along with its “-shm” and “-wal” files were copied, audited and hashed so that their states could be examined.

Step 1: Create empty database with a single table:

8a9938bc7252c3ab9cc3da64a0e0e06a *database.db
b5ad3398bf9e32f1fa3cca9036290774 *database.db-shm
da1a0a1519d973f4ab7935cec399ba58 *database.db-wal

1,024       database.db
32,768      database.db-shm
2,128       database.db-wal

WAL Checkpoint Number:  0
WAL Salt-1: 3046154441
WAL Salt-2: 220701676

Viewing the database file using a hex editor we find a single page containing the file header and nothing else. As noted as well as creating a database file, a table was also created, however this data was written to the WAL in the form of a new version of this page. The WAL contains two frames, this new version of the first page in addition to a second frame holding an empty table page. When accessing this database through the SQLite engine this information is read from the “-wal” file transparently and we see the empty table, even though the data doesn’t appear in the database file itself.

Step 2: Force a checkpoint using PRAGMA command:

dd376606c00867dc34532a44aeb0edb6 *database.db
1878dbcefc552cb1230fce65df13b8c7 *database.db-shm
da1a0a1519d973f4ab7935cec399ba58 *database.db-wal

2,048       database.db
32,768      database.db-shm
2,128       database.db-wal

WAL Checkpoint Number:  0
WAL Salt-1: 3046154441
WAL Salt-2: 220701676

Using the pragma command mentioned above, the database was “checkpointed”. Accessing the database through SQLite we see no difference to the data but examining the files involved, we can clearly see that the database file has changed (it has different hash) furthermore it has grown. Looking inside the database file we can see the two pages from the “-wal” file have now been written into the database file itself and SQLite will be reading this data from here rather than the “-wal” file.

The WAL Checkpoint number and salts were not changed at this point, as we will see they are altered the next time that the WAL is written to.

Another interesting observation is that the “-wal” file was left completely unchanged during the checkpoint process – a fact that will become extremely important in the next step.

Step 3: Insert a single row:

dd376606c00867dc34532a44aeb0edb6 *database.db
6dc09958989a6c0094a99a66531f126f *database.db-shm
e9fc939269dbdbfbc157d8c12be720ed *database.db-wal

2,048       database.db
32,768      database.db-shm
2,128       database.db-wal

WAL Checkpoint Number:  1
WAL Salt-1: 3046154442
WAL Salt-2: 534753839

A single row was inserted into the database using a SQL INSERT statement. Once again we arrive at a situation where the database file itself has been left untouched, evidenced by the fact that the database file’s hash hasn’t altered since the last step.

The “-wal” file hasn’t changed size (so still contains two WAL frames) but clearly the contents of the file have changed. Indeed, examining the file in a hex editor we find that the first frame in the file contains a table page containing the newly inserted record as we would expect. What is interesting is that the second frame in the file is the same second frame found in the file in the previous two steps. After a checkpoint the “-wal” file is not deleted or truncated, it is simply reused, frames being overwritten from the top of the file.

Examining the Frame’s headers we see the following:

Frame Page Number Commit Size Salt-1 Salt-2
1 2 2 3046154442 534753839
2 2 2 3046154441 220701676

Both frames relate to the same page in the database but their salt values differ. As previously noted these two salt values are copied from the WAL file header as they are at the time of writing. Salt-2 is regenerated upon each checkpoint, but key here is Salt-1 which is initialised when the WAL is first created and then incremented upon each checkpoint. Using this value we can show that the page held in second frame of the WAL is a previous version of page held in the first frame: we can begin to demonstrate a timeline of changes to the database.

Step 4: Force a checkpoint using PRAGMA command:

704c633fdceceb34f215cd7fe17f0e84 *database.db
a98ab9ed82393b728a91aacc90b1d788 *database.db-shm
e9fc939269dbdbfbc157d8c12be720ed *database.db-wal

2,048       database.db
32,768      database.db-shm
2,128       database.db-wal

WAL Checkpoint Number:  1
WAL Salt-1: 3046154442
WAL Salt-2: 534753839

Once again a checkpoint was forced using the PRAGMA command.  As before the updated pages in the WAL were written into the database file and this operation had no effect on the contents of the “-wal” itself. Viewing the database using the SQLite engine shows the same data as in the previous step.

Step 5: Insert a second row, Update contents of the first row:

704c633fdceceb34f215cd7fe17f0e84 *database.db
d17cf8f25deaa8dbf4811b4d21216506 *database.db-shm
ed5f0336c23aef476c656dd263849dd0 *database.db-wal

2,048       database.db
32,768      database.db-shm
2,128       database.db-wal

WAL Checkpoint Number:  2
WAL Salt-1: 3046154443
WAL Salt-2: 3543470737

A second row was added to the database using a SQL INSERT statement and the previously added row was altered using an UPDATE statement.

Once again, and as is now fully expected, the database file is unchanged, the new data has been written to the WAL. The WAL contains two frames: The first holds a table page containing the original record along with our newly added second record; the second frame holds a table page containing the updated version of our original record along with the new, second record. Examining the frame headers we see the following:

Frame Page Number Commit Size Salt-1 Salt-2
1 2 2 3046154443 3543470737
2 2 2 3046154443 3543470737

In this case both frames contain data belonging to the same page in the database and the same checkpoint (Salt-1 is the same for both frames); in this case the order of events is simply detected by the order in which the frames appear in the file – they are written to the file from the top, down.

Step 6: Insert a third row:

704c633fdceceb34f215cd7fe17f0e84 *database.db
5ac6d9e56e6bbb15981645cc6b4b4d6b *database.db-shm
672a97935722024aff4f1e2cf43d83ad *database.db-wal

2,048       database.db
32,768      database.db-shm
3,176       database.db-wal

WAL Checkpoint Number:  2
WAL Salt-1: 3046154443
WAL Salt-2: 3543470737

Next, a third row was added to the database using an INSERT statement. Viewing the database logicaly with the SQLite engine we see all three records. While database file remains unchanged, the “-wal” file now contains 3 frames: the first two are as in the previous step with the third and final new frame holding a table page with all three records. The frame headers contain the following information:

Frame Page Number Commit Size Salt-1 Salt-2
1 2 2 3046154443 3543470737
2 2 2 3046154443 3543470737
3 2 2 3046154443 3543470737

We now have three versions of the same page, as before the sequence of events is denoted by the order they occur in the file.

Step 7: Force a checkpoint using PRAGMA command:

04a16e75245601651853fd0457a4975c *database.db
05be4054f8e33505cc2cd7d98c9e7b31 *database.db-shm
672a97935722024aff4f1e2cf43d83ad *database.db-wal

2,048       database.db
32,768      database.db-shm
3,176       database.db-wal

WAL Checkpoint Number:  2
WAL Salt-1: 3046154443
WAL Salt-2: 3543470737

As we have observed before the checkpoint results in to the up-to-date records being written into the database, the “-wal” file is unaffected.

Step 8: Delete A Row:

04a16e75245601651853fd0457a4975c *database.db
dca5c61a689fe73b3c395fd857a9795a *database.db-shm
3b518081a5ab4a7be6449e86bb9c2589 *database.db-wal

2,048       database.db
32,768      database.db-shm
3,176       database.db-wal

WAL Checkpoint Number:  3
WAL Salt-1: 3046154444
WAL Salt-2: 2798791151

Finally in this test, the second record in the table (the record added in Step 5) was deleted using an SQL DELETE statement. Accessing the database using the SQLite engine shows that the record is no longer live in the database.

As per expectations the database file is unaffected by this operation, the changes instead being written to the WAL. The “-wal” file contains three frames:  the first frame holds a table page with the second record deleted (the data can still be seen, and could be recovered using a tool such as Epilog, however the metadata on the page shows that the record is not live). The remaining two pages are identical to the final two frames in the previous step. Examining the frame headers we see the following:

Frame Page Number Commit Size Salt-1 Salt-2
1 2 2 3046154444 2798791151
2 2 2 3046154443 3543470737
3 2 2 3046154443 3543470737

Here we once again see three frames all containing data from the same database page, this time the most recent version of the page is found in frame 1 as it has the highest Salt-1 value; the other two frames have a lower Salt-1 value and are therefore older revisions; as they both share the same Salt-1 value we apply the “position in file” rule, the later in the file the frame occurs, the newer it is. So in order of newest to oldest the frames are ordered: 1, 3, 2.

Summarising the findings in this experiment:

  • Altered or new pages are written to the WAL a frame at a time, rather than the database file
  • The most up-to-date pages in the WAL are written to the database file on a Checkpoint event – this operation leaves the “-wal” file untouched
  • After a Checkpoint, the “-wal” file is reused rather than deleted or truncated,  with new frames
  • Multiple frames for the same database page can exist in the WAL, their relative ages can be derived by first examining the frame header’s Salt-1 value with newer frames having higher values. Where multiple frames have the same Salt-1, their age is determined by their order in the WAL, with newer frames occurring later

Pitfalls and Opportunities

The most obvious opportunity afforded by the Write Ahead Log is the potential for time-lining of activity in database. To prove the concept, a small Python script was written which would automate the analysis of the frames in a WAL file and provide a chronology of the data; a sample output is shown below:

Header Info:
    Page Size: 1024
    Checkpoint Sequence: 3
    Salt-1: 3046154444
    Salt-2: 2798791151

Reading frames...

Frame 1 (offset 32)
    Page Number: 2
    Commit Size: 2
    Salt-1: 3046154444
    Salt-2: 2798791151

Frame 2 (offset 1080)
    Page Number: 2
    Commit Size: 2
    Salt-1: 3046154443
    Salt-2: 3543470737

Frame 3 (offset 2128)
    Page Number: 2
    Commit Size: 2
    Salt-1: 3046154443
    Salt-2: 3543470737

Unique Salt-1 values:
    3046154443
    3046154444

Chronology of frames (oldest first):

Page Number: 2
    Frame 2
    Frame 3
    Frame 1

With further work it should be possible to display a sequence of insertions, updates and deletions of records within a database – a feature which is a top priority for the next update of Epilog. Even without the ability to timeline, it is clear that deleted records can be stored and recovered from the WAL (functionality already present in Epilog).

One behaviour which hasn’t been described in full so far is that a database file in WAL mode isolated from its associated “-wal” file is, in almost all circumstances, a valid database in its own right. For example, consider the test database above as it is at the end of Step 8. If the database file was moved to another directory, as far as the SQLite database engine is concerned this is a complete database file. If this isolated database file was queried, the data returned will be that which was present at the last checkpoint (in our test case, this would be the 3 live records present at the checkpoint performed in step 7).

This raises an important consideration when working with a SQLite database contained in a disk image or other container (eg. a TAR archive): if the database file is extracted from the image or container without its associated WAL files, the data can be out-of-date or incomplete. The other side of the coin is that the “full up-to-date” version of the data (viewed with the WAL present) may lack records present in the isolated database file because of deletions pending a checkpoint. There is, then, an argument for examining databases both ways: complete with WAL files and isolated as it may be possible to obtain deleted records “for free”.

Summing Up

The Write Ahead Log introduced in SQLite 3.7 may afford digital forensics practitioners new opportunities to extract extra data and behaviour information from SQLite databases; however the mechanism should be understood to get the most of the new opportunities and avoid confusion when working with the databases.

If you have any comments or questions, please leave a comment below or get in touch directly at research@ccl-forensics.com.

References:

SQLite File Format

Write Ahead Log

Alex Caithness, CCL-Forensics

Digital forensic software – grab it while it’s hot!

CCL-Forensics is offering its software at introductory prices for just one more week, so take a look at what’s on offer and squeeze as much into tight budgets as you can.

The tried-and-tested software, developed by analysts, for analysts, has been used extensively in the field by CCL-Forensics’ own investigators and by many other digital investigators from around the world.

From March 31st, prices will be increasing, so take advantage of the lower rates now.

Leading research and development in digital forensics

CCL-Forensics’ research and development team has produced a series of forensic software tools to aid them in digital investigations.

epilog allows investigators to recover deleted data from the widely-used database format, SQLite. Whatever the type of device – computers, mobile phones, SatNavs or others – epilog can be used to recover information, regardless of the type of data stored.

PIP allows analysts to present often-complex data from XML files quickly and efficiently. The tool also parses data from Apple’s property list (plist) files – both in XML and binary format. It can be used to look at computers, mobile phones and SatNavs.

dunk! can uncover potential new web activity evidence from locally-stored web cookies, putting web evidence into context and adding an extra dimension to investigations. It also parses Google Analytics cookies, showing how often, from where, and how a user arrived at a particular site, as well as presenting any search terms used to find the page.

Find out more

For more information about what CCL-Forensics can offer or to purchase the software tools, please visit our website, call us on 01789 261200 or email info@ccl-forensics.com.

Forensic software tools – get ‘em while they’re hot, they’re lovely!

The R&D team at CCL-Forensics are a busy bunch. Over the past couple of years, they’ve developed a number of forensic software tools to examine the evidence that standard tools can’t reach.

Here’s a quick overview of what’s on offer. Follow the links to find out more, or give us a shout by phone (01789 261200) or email (info@ccl-forensics.com) – we’re always happy to talk geek with like-minded practitioners.

epilog allows investigators to recover deleted data from SQLite databases, a widely-used format in many devices including mobile phones, computers and SatNavs). Many off-the-shelf tools will only allow you to view live records.

PIP is our XML and plist parsing tool. It allows investigators to present often-complex data from XML files quickly, efficiently, and in a user-friendly format. Apple’s property list files – both XML and binary formats – present no obstacle to PIP at all.

dunk! is a splendidly-named tool for digging around in cookies. Unlike standard tools, it analyses known cookie types to uncover potential new evidence and help give context to other browser artefacts. This includes showing the path the user took to arrive at a particular webpage by parsing Google Analytics cookies, revealing a wealth of information previously unavailable to practitioners.

rubus  is FREE! We like to give a little love back to the community, so with this in mind, we made our BlackBerry backup deconstruction tool available. Not having found a tool that would do the job, we made our own – enabling analysts to reverse engineer BlackBerry backup data stored in .ipd files.

The tools all went through beta-testing first, and were pronounced ready to unleash upon the world. Since then, they’ve been subject to an introductory pricing period, and have been bought and used successfully around the world.

Now that we’re confident in the tools we’ve developed, we’re also confident in their value to our customers. So with that in mind, if you haven’t bought the tools already, you may want to think about doing so! The introductory pricing period finishes at the end of March – and although they’ll still be extremely good value for money, they will be a little more expensive.

We’ve had useful feedback from our customers in the past, which has helped us to further develop our tools, and we always welcome comments and suggestions on our software. Feel free to comment below, or get in touch with us in more traditional ways!

Recovering chat logs with epilog

Alex Caithness, epilog‘s developer and programmer, has put together a short video to demonstrate how epilog can recover chat log messages – in this case, from an Apple iPhone.

Epilog is available for free trial download from www.ccl-forensics.com.

Dunk your cookies in our software

Cookies are often seen as the poor cousin of digital evidence, but they can provide a wealth of information for digital investigators – including how often, from where and how a user visited a certain site – as well as the search terms used to find it.

So how can we access this treasure trove of knowledge?

By using a piece of software called dunk! which covers all the main PC internet browsers (Chrome, Firefox, Safari, IE, etc.) and a wide range of mobile browsers.

The inspiration for dunk! came after conducting an examination of an iPhone during which we found that evidence for the web history and cache was thin on the ground – although we were getting some interesting key word hits in the cookies.

Previously, analysts had been dumping cookies into a straightforward table view, but not looking at the structure of the cookies’ values. However, in this case all the interesting key words fell inside what were found to be Google Analytics cookies. The nice thing about these cookies was that, unlike many cookies where the structure is proprietorial, these were consistent between all sites and contained really interesting insights into a user’s web activity.

We wrote a program enabling us to view all the cookies at once, and where known structures (such as Google Analytics) were found, automatically parse them – and we designed it to support as many browsers as possible.

But that’s not all it does; dunk! can detect session cookies which may contain usernames, email addresses, and sometimes even passwords, allowing investigators to build the fullest picture possible of browsing habits.

The interface allows the data to be filtered, searched and exported. In a nutshell, the software does the following:

  • Processes cookies from PCs and mobile devices
    • Internet Explorer 5+
    • Mozilla Firefox 3.x
    • Mozilla Firefox 4.0
    • Google Chrome
    • Safari browser
    • Opera 5+
    • Apple “binarycookies” format
    • Android browser
    • Flash cookies
    • Nokia 40 browser
  • Parses Google Analytics cookies
  • Parses Adobe Flash cookies
  • Enables investigators to search and filter evidence
  • Detects session cookies which may contain usernames, email addresses, etc.
  • Outputs to TSV and XML file formats

Open the cookie jar and take a detailed look at what’s inside.

Alex Caithness

Dunk! Developer