New Epilog Signature files released

Epilog Signature files allow users to add specific support for new databases they encounter and although they are designed so that Epilog’s users can create their own signatures when the need arises, CCL-Forensics are committed to updating and releasing a sets of signatures, pre-written and ready to use.

In this new release we have had a real focus on smartphones adding support for:
• iOS6
• Android 4.0 (Ice Cream Sandwich)
• Android 4.1 (Jelly Bean)
• Android 3rd Party Applications
• iOS 3rd Party Applications
• Skype

We always welcome suggestions for signatures that you’d like to see added to the signature collection so please get in touch on epilog@ccl-forensics.com

For more information on epilog please visit our website – www.cclgroupltd.com/Buy-Software/

Advertisements

The Forensic Implications of SQLite’s Write Ahead Log

By Alex Caithness, CCL-Forensics

SQLite is a popular free file-based database format which is used extensively both on desktop and mobile operating systems (it is one of the standard storage formats available on both Android and iOS). This article sets out to examine the forensic implications, both pitfalls and opportunities, of a relatively new feature of the database engine: Write Ahead Log.

Before we begin, it is worth taking a moment to describe the SQLite file format. Briefly, records in the database are stored in file which in SQLite parlance is called the ‘Database Image’. The database image is broken up into “pages” of a fixed size (the size is specified in the file header). Each page may have one of a number of roles, such as informing the structure of the database, and crucially holding the record data itself. The pages are numbered internally by SQLite starting from 1.

Historically SQLite used a mechanism called “Rollback Journals” for dealing with errors occurring during use of the database. Whenever any data on a page of the database was to be altered, the entire page was backed up in a separate journal file. At the conclusion of a successful transaction the journal file would be removed; conversely if the transaction was interrupted for any reason (crash, power cut, etc.) the journal remained. This means that if SQLite accesses a database and finds that a journal is still present something must have gone wrong and the engine will restore the database to its previous state using the copies of pages in the journal, avoiding corrupted data.

From version 3.7.0 of the SQLite engine an alternative journal mechanism was introduced called “Write Ahead Log” (ubiquitously shortened to “WAL”). WAL effectively turned the journal mechanism on its head: rather than backing up the original pages then making changes directly to the database file, the database file itself is untouched and the new or altered pages are written to a separate file (the Write Ahead Log). These altered or new pages will remain in the WAL file, the database engine reading data from the WAL in place of the historic version in the main database. This continues until a “Checkpoint” event takes place, finally copying the pages in the WAL file into the main database file. A Checkpoint may take place automatically when the WAL file reaches a certain size (by default this is 1000 pages) or performed manually by issuing an SQL command (“PRAGMA wal_checkpoint;”) or programmatically if an application has access to the SQLite engine’s internal API.

Initial state: No pages in the WAL

Initial state: No pages in the WAL

Page Altered - new version written to WAL

Page 3 is altered. The new version of the page is written to the WAL and the database engine uses this new version rather than the old version in the database file itself.

Checkpoint

A checkpoint operation takes place and the new version of the page is written into the database file.

It is possible to detect whether a database is in WAL mode in a number of ways: firstly this information is found in the database file’s header; examining the file in a hex editor, the bytes at file offset 18 and 19 will both be 0x01 if the database is using the legacy rollback journal or 0x02 if the database is in WAL mode. Secondly you can issue the SQL command “PRAGMA journal_mode;” which will return the value “wal” if the database is in WAL mode (anything else indicates rollback journal). However, probably the most obvious indication of a database in WAL mode is the presence of two files named as “<databasefilename>-wal” and “<databasefilename>-shm” in the same logical directory as the database (eg. if the database was called “sms.db” the two additional files would be “sms.db-wal” and “sms.db-shm”).

The “-wal” file is the actual Write Ahead Log which contains the new and updated database pages, its structure is actually fairly simplistic. The “-wal” file is made up of a 32 byte file header followed by zero or more “WAL frames”. The file header contains the following data:

Offset Size Description
0 4 bytes File signature (0x377F0682 or 0x377F0683)
4 4 bytes File format version (currently 0x002DE218 which interpreted as a big endian integer is 3007000)
8 4 bytes Associated database’s page size (32-bit big endian integer)
12 4 bytes Checkpoint sequence number (32-bit big endian integer which is incremented with every checkpoint, starting at 0)
16 4 bytes Salt-1 Random number, incremented with every checkpoint *
20 4 bytes Salt-2 Random number, regenerated with every checkpoint
24 4 bytes Checksum part 1 (for the first 24 bytes of the file)
28 4 bytes Checksum part 2 (for the first 24 bytes of the file)

* In testing it was found that although the official (and at the time of writing, up to date) command line version of SQLite v3.7.11 behaved correctly, when using SQLite Expert v3.2.2.2.2102 this value appeared to be regenerated after each checkpoint (which is assumed by the author to be incorrect behaviour)

The WAL Frames that follow the header consist of a 24 byte header followed by the number of bytes specified in the file header’s “page size” field which is the new or altered database page. The Frame Header takes the following form:

Offset Size Description
0 4 bytes Database page number (32-bit big endian integer)
4 4 bytes For a record that marks the end of a transaction (a commit record) this will be a 32-bit big endian integer giving the size of the database file in pages, otherwise 0.
8 4 bytes Salt-1, as found in the WAL header at the time that this Frame was written
12 4 bytes Salt-2, as found in the WAL header at the time that this Frame was written
16 4 bytes Checksum part 1 – cumulative checksum up through and including this page
20 4 bytes Checksum part 2 – cumulative checksum up through and including this page

There are a number of potential uses and abuses for the WAL file in the context of digital forensics, but first, the behaviour of SQLite while in WAL mode should examined. A number of operations were performed on a SQLite database in WAL mode. After each operation the database file along with its “-shm” and “-wal” files were copied, audited and hashed so that their states could be examined.

Step 1: Create empty database with a single table:

8a9938bc7252c3ab9cc3da64a0e0e06a *database.db
b5ad3398bf9e32f1fa3cca9036290774 *database.db-shm
da1a0a1519d973f4ab7935cec399ba58 *database.db-wal

1,024       database.db
32,768      database.db-shm
2,128       database.db-wal

WAL Checkpoint Number:  0
WAL Salt-1: 3046154441
WAL Salt-2: 220701676

Viewing the database file using a hex editor we find a single page containing the file header and nothing else. As noted as well as creating a database file, a table was also created, however this data was written to the WAL in the form of a new version of this page. The WAL contains two frames, this new version of the first page in addition to a second frame holding an empty table page. When accessing this database through the SQLite engine this information is read from the “-wal” file transparently and we see the empty table, even though the data doesn’t appear in the database file itself.

Step 2: Force a checkpoint using PRAGMA command:

dd376606c00867dc34532a44aeb0edb6 *database.db
1878dbcefc552cb1230fce65df13b8c7 *database.db-shm
da1a0a1519d973f4ab7935cec399ba58 *database.db-wal

2,048       database.db
32,768      database.db-shm
2,128       database.db-wal

WAL Checkpoint Number:  0
WAL Salt-1: 3046154441
WAL Salt-2: 220701676

Using the pragma command mentioned above, the database was “checkpointed”. Accessing the database through SQLite we see no difference to the data but examining the files involved, we can clearly see that the database file has changed (it has different hash) furthermore it has grown. Looking inside the database file we can see the two pages from the “-wal” file have now been written into the database file itself and SQLite will be reading this data from here rather than the “-wal” file.

The WAL Checkpoint number and salts were not changed at this point, as we will see they are altered the next time that the WAL is written to.

Another interesting observation is that the “-wal” file was left completely unchanged during the checkpoint process – a fact that will become extremely important in the next step.

Step 3: Insert a single row:

dd376606c00867dc34532a44aeb0edb6 *database.db
6dc09958989a6c0094a99a66531f126f *database.db-shm
e9fc939269dbdbfbc157d8c12be720ed *database.db-wal

2,048       database.db
32,768      database.db-shm
2,128       database.db-wal

WAL Checkpoint Number:  1
WAL Salt-1: 3046154442
WAL Salt-2: 534753839

A single row was inserted into the database using a SQL INSERT statement. Once again we arrive at a situation where the database file itself has been left untouched, evidenced by the fact that the database file’s hash hasn’t altered since the last step.

The “-wal” file hasn’t changed size (so still contains two WAL frames) but clearly the contents of the file have changed. Indeed, examining the file in a hex editor we find that the first frame in the file contains a table page containing the newly inserted record as we would expect. What is interesting is that the second frame in the file is the same second frame found in the file in the previous two steps. After a checkpoint the “-wal” file is not deleted or truncated, it is simply reused, frames being overwritten from the top of the file.

Examining the Frame’s headers we see the following:

Frame Page Number Commit Size Salt-1 Salt-2
1 2 2 3046154442 534753839
2 2 2 3046154441 220701676

Both frames relate to the same page in the database but their salt values differ. As previously noted these two salt values are copied from the WAL file header as they are at the time of writing. Salt-2 is regenerated upon each checkpoint, but key here is Salt-1 which is initialised when the WAL is first created and then incremented upon each checkpoint. Using this value we can show that the page held in second frame of the WAL is a previous version of page held in the first frame: we can begin to demonstrate a timeline of changes to the database.

Step 4: Force a checkpoint using PRAGMA command:

704c633fdceceb34f215cd7fe17f0e84 *database.db
a98ab9ed82393b728a91aacc90b1d788 *database.db-shm
e9fc939269dbdbfbc157d8c12be720ed *database.db-wal

2,048       database.db
32,768      database.db-shm
2,128       database.db-wal

WAL Checkpoint Number:  1
WAL Salt-1: 3046154442
WAL Salt-2: 534753839

Once again a checkpoint was forced using the PRAGMA command.  As before the updated pages in the WAL were written into the database file and this operation had no effect on the contents of the “-wal” itself. Viewing the database using the SQLite engine shows the same data as in the previous step.

Step 5: Insert a second row, Update contents of the first row:

704c633fdceceb34f215cd7fe17f0e84 *database.db
d17cf8f25deaa8dbf4811b4d21216506 *database.db-shm
ed5f0336c23aef476c656dd263849dd0 *database.db-wal

2,048       database.db
32,768      database.db-shm
2,128       database.db-wal

WAL Checkpoint Number:  2
WAL Salt-1: 3046154443
WAL Salt-2: 3543470737

A second row was added to the database using a SQL INSERT statement and the previously added row was altered using an UPDATE statement.

Once again, and as is now fully expected, the database file is unchanged, the new data has been written to the WAL. The WAL contains two frames: The first holds a table page containing the original record along with our newly added second record; the second frame holds a table page containing the updated version of our original record along with the new, second record. Examining the frame headers we see the following:

Frame Page Number Commit Size Salt-1 Salt-2
1 2 2 3046154443 3543470737
2 2 2 3046154443 3543470737

In this case both frames contain data belonging to the same page in the database and the same checkpoint (Salt-1 is the same for both frames); in this case the order of events is simply detected by the order in which the frames appear in the file – they are written to the file from the top, down.

Step 6: Insert a third row:

704c633fdceceb34f215cd7fe17f0e84 *database.db
5ac6d9e56e6bbb15981645cc6b4b4d6b *database.db-shm
672a97935722024aff4f1e2cf43d83ad *database.db-wal

2,048       database.db
32,768      database.db-shm
3,176       database.db-wal

WAL Checkpoint Number:  2
WAL Salt-1: 3046154443
WAL Salt-2: 3543470737

Next, a third row was added to the database using an INSERT statement. Viewing the database logicaly with the SQLite engine we see all three records. While database file remains unchanged, the “-wal” file now contains 3 frames: the first two are as in the previous step with the third and final new frame holding a table page with all three records. The frame headers contain the following information:

Frame Page Number Commit Size Salt-1 Salt-2
1 2 2 3046154443 3543470737
2 2 2 3046154443 3543470737
3 2 2 3046154443 3543470737

We now have three versions of the same page, as before the sequence of events is denoted by the order they occur in the file.

Step 7: Force a checkpoint using PRAGMA command:

04a16e75245601651853fd0457a4975c *database.db
05be4054f8e33505cc2cd7d98c9e7b31 *database.db-shm
672a97935722024aff4f1e2cf43d83ad *database.db-wal

2,048       database.db
32,768      database.db-shm
3,176       database.db-wal

WAL Checkpoint Number:  2
WAL Salt-1: 3046154443
WAL Salt-2: 3543470737

As we have observed before the checkpoint results in to the up-to-date records being written into the database, the “-wal” file is unaffected.

Step 8: Delete A Row:

04a16e75245601651853fd0457a4975c *database.db
dca5c61a689fe73b3c395fd857a9795a *database.db-shm
3b518081a5ab4a7be6449e86bb9c2589 *database.db-wal

2,048       database.db
32,768      database.db-shm
3,176       database.db-wal

WAL Checkpoint Number:  3
WAL Salt-1: 3046154444
WAL Salt-2: 2798791151

Finally in this test, the second record in the table (the record added in Step 5) was deleted using an SQL DELETE statement. Accessing the database using the SQLite engine shows that the record is no longer live in the database.

As per expectations the database file is unaffected by this operation, the changes instead being written to the WAL. The “-wal” file contains three frames:  the first frame holds a table page with the second record deleted (the data can still be seen, and could be recovered using a tool such as Epilog, however the metadata on the page shows that the record is not live). The remaining two pages are identical to the final two frames in the previous step. Examining the frame headers we see the following:

Frame Page Number Commit Size Salt-1 Salt-2
1 2 2 3046154444 2798791151
2 2 2 3046154443 3543470737
3 2 2 3046154443 3543470737

Here we once again see three frames all containing data from the same database page, this time the most recent version of the page is found in frame 1 as it has the highest Salt-1 value; the other two frames have a lower Salt-1 value and are therefore older revisions; as they both share the same Salt-1 value we apply the “position in file” rule, the later in the file the frame occurs, the newer it is. So in order of newest to oldest the frames are ordered: 1, 3, 2.

Summarising the findings in this experiment:

  • Altered or new pages are written to the WAL a frame at a time, rather than the database file
  • The most up-to-date pages in the WAL are written to the database file on a Checkpoint event – this operation leaves the “-wal” file untouched
  • After a Checkpoint, the “-wal” file is reused rather than deleted or truncated,  with new frames
  • Multiple frames for the same database page can exist in the WAL, their relative ages can be derived by first examining the frame header’s Salt-1 value with newer frames having higher values. Where multiple frames have the same Salt-1, their age is determined by their order in the WAL, with newer frames occurring later

Pitfalls and Opportunities

The most obvious opportunity afforded by the Write Ahead Log is the potential for time-lining of activity in database. To prove the concept, a small Python script was written which would automate the analysis of the frames in a WAL file and provide a chronology of the data; a sample output is shown below:

Header Info:
    Page Size: 1024
    Checkpoint Sequence: 3
    Salt-1: 3046154444
    Salt-2: 2798791151

Reading frames...

Frame 1 (offset 32)
    Page Number: 2
    Commit Size: 2
    Salt-1: 3046154444
    Salt-2: 2798791151

Frame 2 (offset 1080)
    Page Number: 2
    Commit Size: 2
    Salt-1: 3046154443
    Salt-2: 3543470737

Frame 3 (offset 2128)
    Page Number: 2
    Commit Size: 2
    Salt-1: 3046154443
    Salt-2: 3543470737

Unique Salt-1 values:
    3046154443
    3046154444

Chronology of frames (oldest first):

Page Number: 2
    Frame 2
    Frame 3
    Frame 1

With further work it should be possible to display a sequence of insertions, updates and deletions of records within a database – a feature which is a top priority for the next update of Epilog. Even without the ability to timeline, it is clear that deleted records can be stored and recovered from the WAL (functionality already present in Epilog).

One behaviour which hasn’t been described in full so far is that a database file in WAL mode isolated from its associated “-wal” file is, in almost all circumstances, a valid database in its own right. For example, consider the test database above as it is at the end of Step 8. If the database file was moved to another directory, as far as the SQLite database engine is concerned this is a complete database file. If this isolated database file was queried, the data returned will be that which was present at the last checkpoint (in our test case, this would be the 3 live records present at the checkpoint performed in step 7).

This raises an important consideration when working with a SQLite database contained in a disk image or other container (eg. a TAR archive): if the database file is extracted from the image or container without its associated WAL files, the data can be out-of-date or incomplete. The other side of the coin is that the “full up-to-date” version of the data (viewed with the WAL present) may lack records present in the isolated database file because of deletions pending a checkpoint. There is, then, an argument for examining databases both ways: complete with WAL files and isolated as it may be possible to obtain deleted records “for free”.

Summing Up

The Write Ahead Log introduced in SQLite 3.7 may afford digital forensics practitioners new opportunities to extract extra data and behaviour information from SQLite databases; however the mechanism should be understood to get the most of the new opportunities and avoid confusion when working with the databases.

If you have any comments or questions, please leave a comment below or get in touch directly at research@ccl-forensics.com.

References:

SQLite File Format

Write Ahead Log

Alex Caithness, CCL-Forensics

Cracking Android PINs and passwords

In a previous blog post we described a method to retrieve an Android pattern lock from the raw flash of a device. However, since version 2.2 (known as “Froyo”) Android has provided the option of a more traditional numeric PIN or alphanumeric password (both are required to be 4 to 16 digits or characters in length) as an alternative security measure.

The very act of writing the last blog got us thinking whether it was possible to use a similar approach to recovering the PINs and passwords.

Our first port of call was to return to the Android source code to confirm how the data was being stored (see listing 1). Both the numeric PIN and alphanumeric passwords were found to be processed by the same methods in the same way, both arriving as a text string containing the PIN or password.

As with the pattern lock the code is sensibly not stored in the plain, instead being hashed before it is stored. The hashed data (both SHA-1 and MD5 hash this time) are stored as an ASCII string in a file named password.key which can be found in the same location on the file system as our old friend gesture.key, in the /data/system folder.

However, unlike the pattern lock, the data is salted before being stored. This makes a dictionary attack unfeasible – but if we can reliably recover the salt it would still be possible to attempt a brute force attack.

/*
     * Generate a hash for the given password. To avoid brute force attacks, we use a salted hash.
     * Not the most secure, but it is at least a second level of protection. First level is that
     * the file is in a location only readable by the system process.
     * @param password the gesture pattern.
     * @return the hash of the pattern in a byte array.
     */
     public byte[] passwordToHash(String password) {
        if (password == null) {
            return null;
        }
        String algo = null;
        byte[] hashed = null;
        try {
            byte[] saltedPassword = (password + getSalt()).getBytes();
            byte[] sha1 = MessageDigest.getInstance(algo = "SHA-1").digest(saltedPassword);
            byte[] md5 = MessageDigest.getInstance(algo = "MD5").digest(saltedPassword);
            hashed = (toHex(sha1) + toHex(md5)).getBytes();
        } catch (NoSuchAlgorithmException e) {
            Log.w(TAG, "Failed to encode string because of missing algorithm: " + algo);
        }
        return hashed;

Listing 1

Source: com/android/internal/widget/LockPatternUtils.java.

The salt which is added to the data before hashing is a string of the hexadecimal representation of a random 64-bit integer. Necessarily, this number must then be stored, and the source code showed that the Android.Settings.Secure content provider was being used to store the value under the lockscreen.password_salt key.

On the Android file system the backing store for this content provider is found in an SQLite database settings.db in the /data/data/com.android.providers.settings/databases directory (see fig. 1).

Fig 1: Salt as stored in the settings.db SQLite database

Once we knew how these two essential pieces of data were being stored we were able to consider how they might be recovered from a raw flash dump. In the case of the hashes, our approach was similar to the pattern lock. Knowing that:

  • The dump was broken into chunks of 2048 bytes (2032 for storing the data, the remaining 16 used for YAFFS2 file system tags)
  • The passcword.key file contains two hashes encoded as an ASCII string:  an SHA-1 hash (40 hexadecimal digits long) followed by a MD5 hash (32 hexadecimal digits long) which would make the file 72 bytes long in total starting at the top of the chunk
  • The hashes only contain the characters 0-9 and A-F
  • The remaining 1960 bytes in the data portion of the chunk will be zero bytes

Recovering the salt required a little extra thought. The salt is stored in an SQLite database which, because of the structure of SQLite data, meant that it would be all-but-impossible to predict where in the chunk the data might be stored. Worse still, we couldn’t even be sure of the length of the data as it was stored as a string. However, having a deeper understanding of the SQLite file format allowed us to derive a lightweight and reliable way to recover the salt.

Fig 2: Raw data making up the Salt field in the settings.db database

Figure 2 shows the raw data for the salt record in the settings.db. An SQLite record is made up of two parts, a record header and the record data. The record header is made up of a sequence of values; the first value gives the length of the record header in bytes (including this value), the following values (serial type codes in SQLite parlance) define the data types which follow in the record data.

In our case our serial type codes represent two data types: a null and two strings. The null is unimaginatively represented by the zero-byte highlighted by the red box (if you take a look at Figure 1 you may notice that the first field is displayed as a numeric value 34; this column is defined in the schema as the type INTEGER PRIMARY KEY which means that the value is not actually stored in the record itself, hence being replaced by null. The reasons for this are out of the scope of this blog post, but if you’re particularly interested Alex is more than happy to explain, at length, another time!).

The other two values (highlighted by yellow and green boxes respectively) define strings; in serial type codes a string is represented by a value larger than thirteen, the length of the string can be found by applying the following formula where x is the value of the serial type code:

(x – 13)/2

We can test this by considering the second field: we know that this field will always contain the string: “lockscreen.password_salt” which is 24 characters (and bytes) long. The serial type code associated with this data is the value highlighted with the yellow box: 0x3D which is 61 in decimal. Applying the formula: 61 – 13 gives us 48, divided by 2 gives us 24 which is the length of our string.

The field containing the salt is also a string, but its length can vary depending on the value held. We know from the source code that it is a 64 bit integer being stored which gives us a range of -9223372036854775808 to 9223372036854775808 which, allowing for the negative sign means that the value, stored as a string, takes between 1 and 20 characters. Reversing the formula, the second field’s serial type code must be odd and fall between 15 and 53 (0x0F and 0x35).

Using this information we can create a set of rules which should allow us to search for this record in the raw dump so that we can extract the salt:

  • Starting with the record header:
    • First field is always null (and therefore a zero byte)
    • The next field is always a string of length 24 (serial type code 0x3D)
    • The third field is always a string with length 1-20 (odd serial type codes 0x0F-0x35)
  • Moving on to the record data:
    • The first null field doesn’t feature – it’s implicit in the record header
    • The first field to actually appear will always be the string: ”lockscreen.password_salt”
    • The next field will be a string of a positive or negative integer.

This allows us to define a regular expression that should reliably capture this record:

\x00\x3d[\x0f\x11\x13\x15\x17\x19\x1b\x1d\x1f\x21\x23\x25\x27\x29\x2b\x2d\x2f\x31\x33\x35]lockscreen.password_salt-?\d{1,19}

Understanding the record structure also means that once we have captured the record we can ensure that we extract the whole salt value; we can simply read the appropriate serial type code and apply the formula to get the length of the salt’s string.

Satisfied that we could reliably extract the data we needed to recover the PINs or passcodes we crafted a couple of Python scripts – one to find and extract the data in the flash dump, and the other to brute force the hashes recovered (using the salt). On our test handset (an Xperia X10i running Android 2.3) we set a number of PINs and passcodes and found that even on a fairly modest workstation (Python’s hashing modules are gratifyingly efficient) PINs of up to 10 digits could be recovered within a few hours.

The passwords obviously have a much larger key-space so took longer, but the attack still seems feasible for shorter passwords and the script can easily be modified to only use Latin characters and digits rather than any other special characters or work from a password dictionary which could expedite the process.

Fig 3: Running the BruteForceAndroidPin script standalone

Once again we are happy to be releasing the scripts so that other practitioners can make use of this method. The scripts can be downloaded here: http://ccl-forensics.com/view_category/8_other-software-and-scripts.html.

Alex Caithness and Arun Prasannan of the R&D team

Unlocking Android Pattern Locks

What do you do if you have to examine an Android device which has a pattern lock enabled, but USB debugging is not initialised?

If a physical level acquisition can be performed, our technique can be used to retrieve the lock pattern from the device.

The lock pattern is entered by the user joining the points on a 3×3 grid in their chosen order. The pattern must involve a minimum of 4 points and each point can only be used once (used points can be crossed subsequently in the pattern but they will not be registered).

Each point is indexed internally by Android from 0 to 8. So the pattern in Figure 1 would be understood as “0 3 6 7 8 5 2”.

Figure 1


Storing the pattern “in the plain” would constitute a significant security flaw. Android instead stores an SHA-1 hash of the pattern interpreted as a string of bytes. In order to represent the pattern in our example, the byte string: 0x00030607080502 would be hashed to produce: 618b589aa98dfee743e7120913a0665a4a5e8317.

This behaviour can be confirmed by taking a look at the Android source code shown below:

/*
 * Generate an SHA-1 hash for the pattern. Not the most secure, but it is
 * at least a second level of protection. First level is that the file
 * is in a location only readable by the system process.
 * @param pattern the gesture pattern.
 * @return the hash of the pattern in a byte array.
 */
private static byte[] patternToHash(List pattern) {
    if (pattern == null) {
        return null;
    }

    final int patternSize = pattern.size();
    byte[] res = new byte[patternSize];
    for (int i = 0; i < patternSize; i++) {
        LockPatternView.Cell cell = pattern.get(i);
        res[i] = (byte) (cell.getRow() * 3 + cell.getColumn());
    }
    try {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] hash = md.digest(res);
        return hash;
    } catch (NoSuchAlgorithmException nsa) {
        return res;
    }
}

Source: com/android/internal/widget/LockPatternUtils.java.

The pattern lock hash is stored as a byte string in a file named gesture.key which can be found in the /data/system folder on the device’s file system.

Although storing the pattern as a hash is a sensible approach as it is a “one-way” operation from the original data to the hash, Android does not apply a salt to the original data; this, combined with the fact that there is a finite number of possible patterns, makes it fairly trivial to create a dictionary of all valid hashes. If you can gain access to the gesture.key file you can easily recover the original pattern using a dictionary attack – it would take seconds.

However; sensibly, the gesture.key file is stored in an area of the device’s file system which is not accessible in normal operation, requiring root access to the device in order to extract the file; of course if you have root access you also have access to every other file on the device as well, so recovering the pattern becomes far less useful. (http://www.oxygen-forensic.com/download/pressrelease/PR_FS2011_ENG_2011-10-31.pdf)

There are situations, though, where root access cannot be gained (for example, where USB debugging cannot be activated on the device) which spurred us on to investigate whether the pattern lock could be easily recovered from a raw read of the device’s storage, the likes of which could be made by using JTAG or chip-off techniques.

One possible solution would be to rebuild the entire file system from a raw dump, and from this identify the relevant file; however, time constraints ruled this out. Instead we decided to examine whether it was possible to identify the contents of the gesture.key file among the rest of the data in the dump. We set a known pattern lock on a test handset (an HTC Wildfire in this case) and acquired the contents of its flash memory by using JTAG.

Figure 3: JTAG test pads on an HTC Wildfire

Loading the dump into a hex viewer, we noted that it was organised into chunks of 2048 bytes: the first 2032 bytes are allocated for the data in files while it appears that the remaining 16 bytes are used to store metadata (tags) relating to the YAFFS2 file system.  We then searched for the SHA-1 hash which corresponds to our known lock pattern. This data was found to be stored as expected, occupying the first 20 bytes of the block; interestingly we also found that the remaining 2012 bytes were zero-bytes – an observation which was repeated in our other tests.

These observations left us with a set of rules which we believed would allow us to identify a gesture.key file in the raw dump without having to rebuild the entire file system:

  • The dump is organised into blocks of 2048 bytes each
  • The gesture.key file should contain a 20-byte long SHA-1 hash at the start of a block
  • The 20-byte sequence at the start of the block must appear in our dictionary of hashes
  • The following 2012 bytes should all be zero-bytes

Bearing these rules in mind, we wrote a simple Python script which reads a dump 2048 bytes at a time, disregarding blocks which do not fit our required pattern of 20 arbitrary bytes followed by 2012 zero-bytes. If a block does match these criteria, the script then attempts to look up the prospective hash (the first 20 bytes) in our pre-compiled dictionary which was enough to remove any false positives and recover the correct pattern.

One potential shortcoming of this technique is that in its current form it will also identify any previously set patterns which are still present in flash memory. Due to the nature of flash media and the YAFFS file system, which is still found to be used by the majority of Android devices, the gesture.key file does not get over-written in place. Rather the new version is written to a new block with a higher sequence number.

Despite this potential issue, we have used this technique a number of times since with great success and are happy to be making public the scripts for generating the dictionary and for searching the data for research purposes.

GenerateAndroidGestureRainbowTable.py creates an SQLite database file (AndriodLockScreenRainbow.sqlite) which contains a list of hashes for all possible patterns.

Android_GestureFinder.py takes one argument: the flash dump file, and expects AndriodLockScreenRainbow.sqlite to be present in the same folder.

C:\wildfire> Android_GestureFinder.py WILDFIRE_JTAG.bin

[0, 3, 6, 7, 8, 5, 2] 618b589aa98dfee743e7120913a0665a4a5e8317

The scripts can be downloaded from our website: /www.ccl-forensics.com/Software/other-software-a-scripts.html. We hope that researchers and analysts will find the technique useful, and we welcome any comments or suggestions.

Alex Caithness and Arun Prasannan of the R&D team

Grab deleted SMS from Androids

In this, the third in our series of brief videos about epilog, senior programmer Alex Caithness demonstrates how epilog can be used to present deleted SMS messages from a raw image acquired from the JTAG interface of an Android phone.

epilog is an SQLite forensic tool from CCL-Forensics, and is available for download as a free trial from our website.

For more information about epilog, please contact the team at epilog@ccl-forensics.com.

Android Ice Cream Sandwich Browser Cookies (and other artefacts)

The Android browser traditionally had data structures that were distinctly Android; but as Alex Caithness explains, there are signs of convergence with another of Google’s pet projects…

I should probably start by explaining that Android has a delightful habit of naming its operating systems after desserts. The upside of this is that it’s quirky; the downside is that cake consumption in the lab increases by a significant factor.

Hence the name “Ice Cream Sandwich”.

Across previous versions of Android, the cookie storage format has remained unchanged: they have been neatly stored in the browser’s “databases” folder in the “cookies” table of the “webview.db” SQLite database; this appears to have changed in version 4.0 of Android AKA Ice Cream Sandwich (ICS).

Firstly, what is peculiar is that the “webview.db” file still contains the legacy “cookies” table, however in testing this was never populated. Instead, a new database named “webviewCookiesChromium.db” is used to store cookie data.

The name of the file gives us a big clue to the nature of the file – we’re seeing a convergence between the Android browser and Chromium (the browser upon which Google Chrome is built). Investigating the database confirms this; the schema and structure of data in this new database is identical to that of Chrome’s.

The great news for Dunk! users is that they can go right ahead and use the Google Chrome decoder on this file to parse and extract the cookies held.

There is also a second cookies database present in ICS named “webviewCookiesChromiumPrivate.db”. This database contains cookies transmitted while an “Incognito Tab” (the private browsing feature) is being used. The structure is identical to the other database; however, when the incognito tab is closed the file is truncated to 0 bytes.*

Further evidence of this convergence towards Chrome comes from the cache structure which, like the cookies, has moved to the same structure as is found in Chrome. For more details, take a look at http://www.chromium.org/developers/design-documents/network-stack/disk-cache.

*Although further research is required we anticipate that epilog will be able to recover these records from a raw dump of the flash chip!

Alex Caithness

R&D Team

Updated signature files for epilog

CCL-Forensics’ developers are constantly adding new files to increase the capability of epilog, and the latest signature files are now available for download, free of charge.

These new signature files now contain support for Apple iOS5.

We also welcome suggestions for additions to future signature file releases. Please email us at epilog@ccl-forensics.com.

What is epilog?

For those who don’t know, epilog is a software tool which allows investigators to recover deleted data from the widely-used database format, SQLite. Take a look at the first of our epilog videos:

It was developed by CCL-Forensics and – put simply – it gives investigators access to more data which could prove crucial in an investigation.

Many devices (whether mobile phones, computers, satnavs or other devices) store data in the SQLite database format.

Data stored in this type of database can provide a huge evidential opportunity for investigators. Many “off-the-shelf” tools can be used to view the live records in the database, but epilog  extracts deleted and de-referenced data from the database files or across a disc image or hex dump.

epilog’s three recovery algorithms can be used on any SQLite database, regardless of the type of data stored. However, epilog signatures can be used to tailor its behaviour to a particular database.

Included with the initial release of epilog were signatures including:

  • Android (SMS, call logs, calendars, address book and others)
  • iPhone (SMS, emails, calendar, and others)
  • Smartphone third party applications (including Yahoo Messenger, eBuddy chat and others)
  • Safari (internet history and cache and others)
  • Mozilla (cookies, internet history, form data and others)
  • Chrome (internet history)