Cell site blog – Never mind the quality, feel the width

Thoughts and observations on how ‘more’ could mean ‘less’ in the presentation of cell site analysis.

By Matthew Tart, Cell Site Analyst

This month we look at quality over quantity in cell site analysis, with particular emphasis on a recent example where a pile (literally) of maps could easily have left jurors’ heads spinning – and cost the prosecution a considerable sum.

This month’s topic: Getting the balance right in cell site analysis 

This blog starts with a recent case of ours involving a high-profile crime with a number of defendants.  On this occasion we were working for the defence, but this story acts as a useful pointer for the prosecution by illustrating techniques that experts used by the Crown should – and should not – be using.  We’ll focus on a method used by a large number of cell site analysts (but not ourselves) which is not necessarily robust and may not stand up to close scrutiny.

Q: What were the details of the case?

A: The prosecution were investigating the probability of a suspect being at a crime scene – a pub in an inner city location.  At the time of the crime, one of the suspects (we were working for that suspect’s defence solicitor in this case) made a phone call.  The call data records showed that this phone call was made on what we’ll call ‘cell A’, which was on a mast near the crime scene – but also near to his home address which was about 500m away.

The suspect’s alibi was that he was at home at the time of the crime and the phone call.

The prosecution’s outsourced expert carried out ‘spot samples’ (i.e. turned up at a location with a piece of equipment) at both the crime scene and the alibi location.  Their report showed a different cell serving at each location.  Cell A was shown as best serving at the crime scene – but not at the alibi location.

Q: So what did we do differently?

A: We carried out a much more extensive survey, i.e. a drive survey of the home address and the surrounding area.  This was carried out with regard to the cell of interest (cell A), using multiple pieces of equipment and repeatedly moving in and out of the area.  We found that cell A provided coverage north, south, east and west of both locations (crime scene and alibi location) and, based upon this, could not distinguish between the mobile phone being at either location.  The evidence was simply not strong enough to suggest one or the other.

Q:  So, were both sides saying something different?

A: Yes and no. Before the court date, the prosecution’s outsourced expert asked for a copy of our defence report, which we provided.  We then discussed the contents with the expert over the phone, and he claimed that he wouldn’t expect cell A to provide coverage at the home address.  After looking at our evidence, however, he admitted that our assertion that the cell served at both addresses was actually the most valid interpretation of the evidence.  This is a worrying admission – a u-turn, to say the least – especially as his own evidence did not document that cell A also serves at that crucial home address.

Q:  The other side claimed that a different cell provided service at the home address.  Did your survey find that cell as well?

A: Yes – in fact we found four cells which served at the home address: cell A, the one the other side claimed, AND two others.

Q: How was this data presented by the prosecution’s expert?

A: In a rather cumbersome and lengthy fashion, to say the least.  There were a number of suspects, and their report showed the same maps over and over – and over – again.  It showed the locations of interest, calls for varying time periods, and whether the cells used actually covered the locations.  This came to more than 100 (one hundred) maps, all printed on A3 paper and bound into a daunting, unwieldy piece of physical evidence which the jury would have to absorb.

I would defy even the most attentive juror to have easily made sense of this massive tome.  Quite apart from the threatening size of the document, the pages were practically the same – near-identical copies of other similar pages.  You simply wouldn’t be able to take it all in, especially as one wouldn’t expect jurors to be familiar with this type of evidence – making it all the more crucial to present it in a friendly form.

Q: What would we have done differently?

A: Firstly, we would not have produced a huge, weighty, un-jury-friendly document. The best way of presenting this evidence (for which we would have had MUCH more survey data, having done more than carry out simple spot samples) would have been a series of two or three detailed maps which could be presented interactively at court, with the relevant points highlighted by the expert in the course of presenting the evidence.  These maps would have covered the specific period of interest – and would have had a secondary, financial, benefit.

By not producing hundreds of maps, we would have saved a considerable amount of time – and therefore cost.  We would estimate that producing this unmanageable number of maps and documents would have potentially cost tens of thousands of pounds. Our approach would almost certainly have been cheaper AND more robust.

Q:  So the lesson here is…

A: …to think about what you need to achieve, and the best way of doing it.  Don’t be held to ransom by an outsourced expert’s ‘way of doing something’.  Hopefully this example has shown two things.  One, that carrying out spot samples (as we’ve mentioned in previous blogs) may not be the most appropriate way of surveying.  And secondly, that the end product – i.e. what the jury see and have to understand – can be something a little more sophisticated than a batch of similar-looking, repetitive – and quite frankly, uninspiring – maps and tables.  Technology has moved on.  So has cell site analysis.  And so has the presentation of evidence in court.

In terms of maps, it is quality, not quantity, that delivers the most impactful conclusions in relation to the possible locations of a mobile phone.

For more information about this – or any aspect of cell site analysis, please contact Matthew Tart (or any of our other cell site analysts) on 01789 261200 or by emailing cellsite@ccl-forensics.com

Parsing Apple System Log (ASL) files on iOS and OSX for Fun and Evidence (and a Python script to do it for you)

(If you’re dying to get stuck in and are only after the links to the Python scripts, they can be found at the bottom of the post!)

After every update to iOS I like to take a file system dump of one of our test iDevices and have a poke around to see what’s changed and what’s new. Recently, on one of my excursions around the iOS file system, I came across something that looked promising that I hadn’t dug into before: a bunch of files with the “.asl” extension which were located on the data partition in “log/DiagnosticMessages”. There were lots of them too – each with a file name referring to a particular date – and they went back months!

[Image: DiagnosticMessages file listing – “Log Files!”]

“Loads of lovely log files!” I thought to myself as I excitedly dropped one of the files into my current text editor of choice (Notepad++ if you’re interested) only to be disappointed by what was clearly a binary file format.

[Image: ASL DB file in a text editor – “Curses!”]

So I headed over to Google and entered some hopeful sounding search queries and came across a very useful blog post (http://crucialsecurityblog.harris.com/2011/06/22/the-apple-system-log-%E2%80%93-part-1/) which described the role of ASL files on OSX and listed some ways for accessing the logs from within OSX, but I was interested in gaining a better understanding of the file format (besides, the nearest Mac to me was on a different floor!).

A little more digging revealed that the code governing the ASL logging, and the files it generates, is part of the open source section of OSX; as a result I was able to view the code that is actually responsible for creating the files – my luck was looking up!

The two files I was particularly interested in were “asl.h” (most recent version at time of posting: http://opensource.apple.com/source/Libc/Libc-763.13/include/asl.h) and “asl_file.h” (most recent version at time of posting: http://opensource.apple.com/source/Libc/Libc-763.13/gen/asl_file.h). C header files are great; basically, their purpose is to define the data structures that are subsequently used in the functional code, so when it comes to understanding file formats, quite often they’ll tell you all you need to know without having to try and follow the flow of the actual program. Better yet, these files were pretty well commented. I know that not everyone reading this is going to want to read through a bunch of C code, so I’ll summarise the file format below (all numeric data is big endian):

First, the file Header:

Offset | Length | Data Type     | Description
0      | 12     | String        | “ASL DB” followed by 6 bytes of 0x00
12     | 4      | 32bit Integer | File version (current version is: 2)
16     | 8      | 64bit Integer | File offset of the first record in the file
24     | 8      | 64bit Integer | Unix seconds timestamp; appears to be a file creation time
32     | 4      | 32bit Integer | String cache size (not 100% sure what this refers to; it may be the maximum size for string entries in the records)
36     | 8      | 64bit Integer | File offset of the last record in the file
44     | 36     | Padding       | Should all be 0x00 bytes

So nothing too ominous there, although all of those pad-bytes at the end of the header suggest redundancy in the file spec in case Apple ever fancy changing something. Indeed, the fact that the header tells us we’re on version 2 of the file format suggests that this has already happened.
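To make the header layout concrete, here is a minimal Python sketch of how it could be read. It is a sketch only: the function and field names are my own, but the offsets and big-endian types come straight from the table above.

import struct

def read_asl_header(f):
    # Read the 80-byte "ASL DB" header from an open binary file object.
    # A sketch: function and field names are my own; offsets and
    # big-endian types follow the table above.
    f.seek(0)
    header = f.read(80)
    if header[0:6] != b"ASL DB":
        raise ValueError("Not an ASL DB file")
    version, = struct.unpack(">I", header[12:16])              # expect 2
    first_record_offset, = struct.unpack(">Q", header[16:24])
    creation_timestamp, = struct.unpack(">Q", header[24:32])   # Unix seconds
    string_cache_size, = struct.unpack(">I", header[32:36])
    last_record_offset, = struct.unpack(">Q", header[36:44])
    # Bytes 44-79 should be 0x00 padding.
    return {"version": version,
            "first_record_offset": first_record_offset,
            "creation_timestamp": creation_timestamp,
            "string_cache_size": string_cache_size,
            "last_record_offset": last_record_offset}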

The records in the file are arranged as a “doubly linked list”; that is, every record in the file contains a reference to (i.e. the file offset of) the next and previous records.  At a high level, the records themselves are made up of a fixed-length data section, followed by a variable-length section which allows the storage of additional data in a key-value type structure, finally followed by the offset of the previous record. The table below explains the structure in detail.

NB: The string storage mechanism the records use is a little bit…interesting – I’ll explain in detail later in this post, but for now if you see a reference to an “ASL String”, I mean one of these “interesting” strings!

Offset | Length              | Data Type                   | Description
0      | 2                   | Padding                     | 0x00 0x00
2      | 4                   | 32bit Integer               | Length of this record (excluding this and the previous field)
6      | 8                   | 64bit Integer               | File offset of the next record
14     | 8                   | 64bit Integer               | Numeric ID for this record
22     | 8                   | 64bit Integer               | Record timestamp (as a Unix seconds timestamp)
30     | 4                   | 32bit Integer               | Additional nanoseconds for the timestamp
34     | 2                   | 16bit Integer               | Level (see below)
36     | 2                   | 16bit Integer               | Flags
38     | 4                   | 32bit Integer               | Process ID that sent the log message
42     | 4                   | 32bit Integer               | UID that sent the log message
46     | 4                   | 32bit Integer               | GID that sent the log message
50     | 4                   | 32bit Integer               | User read access
54     | 4                   | 32bit Integer               | Group read access
58     | 4                   | 32bit Integer               | Reference PID (for processes under the control of launchd)
62     | 4                   | 32bit Integer               | Key-Value count: the total number of keys and values in the key-value storage of the record
66     | 8                   | ASL String                  | Host that the sender belongs to (usually the name of the device)
74     | 8                   | ASL String                  | Name of the sender (process) which sent the log message
82     | 8                   | ASL String                  | The sender’s facility
90     | 8                   | ASL String                  | Log message
98     | 8                   | ASL String                  | The name of the reference process (for processes under the control of launchd)
106    | 8                   | ASL String                  | The session of the sender (set by launchd)
114    | 8 * Key-Value count | ASL String[Key-Value count] | The key-value storage: a key followed by a value, followed by a key followed by a value, and so on. All keys and values are strings

The level field mentioned above will have a numerical value which refers to the levels shown below:

Level | Meaning
0     | Emergency
1     | Alert
2     | Critical
3     | Error
4     | Warning
5     | Notice
6     | Info
7     | Debug

As mentioned, the “ASL String” data type is a little odd. The ASL String fields above take up 8 bytes. If the most significant bit of those 8 bytes is set (i.e. is 1), the rest of the most significant byte gives the length of the string, which occupies the remaining 7 bytes (unused bytes are set to 0x00). Conversely, if the top bit is not set (i.e. is 0), the entire 8 bytes should be interpreted as a 64bit integer giving the file offset at which the string can be found. The string at that offset is stored as follows:

Offset | Length        | Data Type                    | Meaning
0      | 2             | Padding                      | Padding bytes 0x00 0x01
2      | 4             | 32bit Integer                | String length
6      | String length | UTF8 String (nul-terminated) | The string data
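As a rough illustration of the two cases, a reader for this field might look something like the sketch below; the helper name and the handling of the trailing nul are my own assumptions.

import struct

def read_asl_string(f, field_bytes):
    # Interpret one 8-byte "ASL String" field, as described above.
    # A sketch: helper name and nul handling are my own assumptions.
    if field_bytes[0] & 0x80:
        # Top bit set: the string is stored inline. The remaining bits of the
        # first byte give the length; the string sits in the other 7 bytes.
        length = field_bytes[0] & 0x7F
        return field_bytes[1:1 + length].decode("utf-8")
    # Top bit clear: the whole field is a big-endian offset into the file.
    offset, = struct.unpack(">Q", field_bytes)
    f.seek(offset + 2)                              # skip the 0x00 0x01 padding
    str_length, = struct.unpack(">I", f.read(4))
    data = f.read(str_length)
    return data.rstrip(b"\x00").decode("utf-8")     # nul-terminated UTF-8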

In order to get a better grip on what can be held in these files, I decided to create a Python module to read them, and used it to dump out the contents of the ASL files I found on the iPhone.
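The core of such a module is a loop that starts at the header’s “first record” offset and follows each record’s “next record” offset until the chain runs out. The sketch below is not the released module: it skips the key-value section for brevity, builds on the two functions above, and the assumption that a zero next-record offset marks the end of the chain is mine.

import struct

LEVELS = {0: "Emergency", 1: "Alert", 2: "Critical", 3: "Error",
          4: "Warning", 5: "Notice", 6: "Info", 7: "Debug"}

def walk_records(f, first_record_offset):
    # Follow the chain of "next record" offsets, yielding a dict per record.
    # A sketch: skips the key-value storage, and assumes a zero next-record
    # offset marks the end of the chain.
    offset = first_record_offset
    while offset != 0:
        f.seek(offset + 6)                    # skip 2 pad bytes + record length
        fixed = f.read(108)                   # record offsets 6 to 113 inclusive
        next_offset, record_id, timestamp = struct.unpack(">QQQ", fixed[0:24])
        nanoseconds, level, flags = struct.unpack(">IHH", fixed[24:32])
        pid, uid, gid, read_uid, read_gid, ref_pid, kv_count = \
            struct.unpack(">7I", fixed[32:60])
        host, sender, facility, message, ref_proc, session = (
            read_asl_string(f, fixed[60 + i * 8:68 + i * 8]) for i in range(6))
        yield {"id": record_id, "time": timestamp,
               "level": LEVELS.get(level, level),
               "host": host, "sender": sender,
               "facility": facility, "message": message}
        offset = next_offset

Used together, the three sketches would let you dump a file along these lines (the file name is purely illustrative):

with open("some_file.asl", "rb") as f:        # hypothetical file name
    header = read_asl_header(f)
    for record in walk_records(f, header["first_record_offset"]):
        print(record["time"], record["level"], record["sender"], record["message"])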

[Image: Running the script]

[Image: Output from the script (iOS) – a snippet of the output produced by processing an iPhone’s ‘DiagnosticMessages’ folder]

The first thing that struck me after running the script was the volume of messages: 16161 log messages spanning 10 months – and this was on a test handset which had lain idle for weeks at a time. The second thing was the prevalence of messages sent by the “powerd” daemon: over 87% of the messages had been sent by this process. The vast majority of these messages related to the device waking and sleeping – not through user interaction, but while the device was idle. Most of these “Wake” events occurred 2-5 minutes apart, presumably to allow brief data connectivity to receive updates and push messages from apps.

[Image: Output from the script – some powerd Wake and Sleep messages]

The key thing that interested me about these messages was that they also noted the current battery-charge percentage in their text: this is the sort of data that just begs to be graphed, so I knocked up a little script which utilised the parsing module I had just written to extract just this data and present it in a graph-friendly manner.
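That script isn’t reproduced here, but the idea can be sketched along the following lines, assuming records come from a parser like the one above. The regular expression and CSV layout are illustrative guesses on my part, since the exact wording of the powerd messages isn’t shown in this post.

import csv
import re

# Illustrative guess at the message format: assumes the powerd text contains
# a percentage such as "... 87% ..."; adjust after inspecting real messages.
BATTERY_PATTERN = re.compile(r"(\d{1,3})%")

def dump_battery_csv(records, out_path):
    # Write timestamp/charge pairs for powerd messages to a CSV file,
    # ready to be pulled into a spreadsheet or plotting library.
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["unix_timestamp", "battery_percent"])
        for record in records:
            if record["sender"] != "powerd":
                continue
            match = BATTERY_PATTERN.search(record["message"] or "")
            if match:
                writer.writerow([record["time"], int(match.group(1))])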

[Image: Graph-friendly powerd data]

After graphing it (you want to use a scatter graph in Excel for this, not a line graph, as I discovered after some shouting at my screen) you are left with a graph which gives you some insight into the device’s use.
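If you would rather stay in Python than jump out to Excel, a scatter plot with matplotlib gives much the same result. This is a sketch, assuming a CSV laid out like the one produced above; the file name is illustrative.

import csv
from datetime import datetime
import matplotlib.pyplot as plt

times, charges = [], []
with open("battery.csv", newline="") as f:        # hypothetical output file
    for row in csv.DictReader(f):
        times.append(datetime.fromtimestamp(int(row["unix_timestamp"])))
        charges.append(int(row["battery_percent"]))

plt.scatter(times, charges, s=4)                  # scatter, not a line plot
plt.xlabel("Time")
plt.ylabel("Battery charge (%)")
plt.show()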

[Image: iOS battery use graph – some iPhone power usage]

The graph above shows around 3 weeks of battery usage data from the test handset. As noted previously, this test device would lie idle for days at a time (as suggested by the gentle downward gradients), but there were periods when the handset was in use, as shown by the steeper downward gradients on the 26th and 27th of April, which mostly took place within office hours. You can also clearly see the points where the device was plugged in to be charged, suggested by the very steep upward gradients. As noted, the power messages occur around every 2-5 minutes, so the resolution is actually fairly good. The exception to this is while the device is plugged in, as it no longer needs to sleep to preserve battery charge; typically I only saw an event when charging began and another when the device was unplugged and the battery began to discharge again.

There are a few other messages in the iOS ASL log that look interesting, but at this time I don’t have enough nice control data to make much of them. One thing that did hearten me somewhat was the fact that on the few extractions I’ve had the opportunity to take a look at from later revisions of iOS 5, there did seem to be some extra processes that were logging messages, so it’s my hope that we’ll see more and more useful data make its way into the ASL logs on iOS.

In addition to looking at iOS ASL files, I thought I’d take a look at some from an OSX installation. Pulling the logs from “var/log/asl” on Lion (10.7.3) and running the parsing script across the whole directory brought back a far more varied selection of messages.

[Image: Output from the script (OSX) – variety is the spice of life]

The number of records returned was actually far lower than on iOS, partially due to the iOS “powerd” being so chatty, but more crucially because OSX tidies up its logs on a weekly basis. That’s not to say that you will only recover a week’s worth of logs though – on this test machine I recovered logs spanning 7 months. Rather, OSX has short-term log files (those with file names which begin with a timestamp), which have a shelf-life of a week, and long-term log files (those with file names which begin with “bb” followed by a timestamp). The “bb” in the long-term log’s file name presumably stands for “best before”, and the date, which is always in the future, is the date on which the file should be cleared out. The short-term log files tend to hold more “intimate” entries, often debug messages sent from 3rd party applications; the long-term logs err more on the side of system messages. One particularly useful set of messages in the long-term logs is the records pertaining to booting, shutting down, logins and logouts (hibernating, waking and failed logins are recorded too, but they end up in the short-term logs).

(As an aside: one of my favourite things that I discovered when looking through these logs was that the action of waking a laptop running OSX by wiggling a finger on the trackpad is recorded in the logs as a “HID Tickle”. Lovely.)

As I did with the iOS power profiling, I put together a script which extracted these login and power records and timelined them.
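Again, that script isn’t reproduced here, but the approach can be sketched as below, reusing the parser from earlier. The keyword list is an assumption on my part, as the exact wording of the boot, shutdown, login and wake messages varies.

from datetime import datetime

# Assumed keywords: the exact wording of the boot/shutdown/login/wake
# messages varies, so tune this list after inspecting real output.
EVENT_KEYWORDS = ("BOOT", "SHUTDOWN", "login", "logout", "Wake", "Sleep")

def print_timeline(records):
    # Print a simple chronological timeline of power and login events.
    events = []
    for record in records:
        message = record["message"] or ""
        if any(keyword in message for keyword in EVENT_KEYWORDS):
            events.append((record["time"], record["sender"], message))
    for timestamp, sender, message in sorted(events):
        print(datetime.fromtimestamp(timestamp).isoformat(), sender, message)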

[Image: OSX login and power timeline]

A couple of things worth noting beyond the basic boot/shutdown/login records: firstly when the device wakes it records why it happened – this can be quite specific: a USB device, the lid of a laptop being opened, the power button being pressed, etc. Secondly, you can see terminal windows (tty) being opened and closed as opening a terminal window involves logging in to a terminal session (OSX does this transparently, but it’s still logged).

We’ve released the scripts mentioned in this post to the community and they can be downloaded from https://code.google.com/p/ccl-asl/. The “ccl_asl” script is both a command line utility for dumping the contents of ASL files and a fully featured class module which you can use to write scripts along the lines of the battery profiler and login timeline scripts.

ASL files are, on the one hand, fairly dry system logs, but on the other, with a little work you can harvest some really insightful behavioural intelligence. As always if you have any questions, comments or suggestions you can contact us on research@ccl-forensics.com or leave a comment below.

Alex Caithness

XML and plist parser – updated version available

We’ve updated PIP – our XML and plist parser.

PIP has already proved incredibly popular, and is used by a number of investigation agencies across the world.

The new version is available for download here (license key purchase required) – and is a free upgrade to those who have already bought PIP.

We’ve listened to feedback received from the family of PIP users, and have introduced the following improvements:

Improved Interface – to improve work-flow, we have updated the application’s layout

New Tree View – see, at a glance, the structure of your data

Automatic building of XPaths – The Tree View can now be used to show PIP the data you are interested in – and PIP generates the XPath automatically.  This feature even works with Apple’s Property List ‘dictionary’ structures.

Import/Export Batch Jobs – Set up a batch job of XPaths for a particular folder structure (iOS Library or Application folders, for example) and then export the batch so that you, or anyone else in your lab, can re-use it when you next come across the same data

Command line version – version 1.1 of PIP comes with the “pipcmd” command-line utility, allowing you to integrate PIP into tool chains and other automated tasks

To find out more, or to purchase PIP, please visit our PIP download page.