Friday 30 March 2012

Using Perl to Copy AppID Data from HTML to an SQLite Database


Someday You'll Find It ... The Jumplist Connection!

So all this talk of Windows 7 Jumplists (eg here and here) got me thinking - What if you needed to look up an AppID and didn't have access to the Internet? Also, is there any way we can import a list of AppIDs so we can then access them from some future scripts (eg for processing Jumplists)?
I then had my "Aha!" moment (no, nothing to do with that 80s band), and thought "SQLite!"
SQLite also has the benefit of being cross-platform - so an AppID List generated on SIFT should work on a PC etc. By using a standard database, we can also ensure that our fearless forensicator lab uses a common set of (unchanging) source data.

So how exactly can we do this (using the SIFT VM)?
1. We create an empty SQLite Database.
2. We parse the ForensicsWiki AppID HTML page for data.
3. We populate the Database.





Creating The Empty Database

Creating an (empty) SQLite AppID Database is pretty straightforward. I opted to do this part manually using SIFT's "sqlite3" command line tool.

1. At a terminal command line, I typed:
"sqlite3 WinAppIDs.sqlite"
2. At the resultant "sqlite3" prompt, I typed (without the quotes):
"CREATE TABLE IF NOT EXISTS WinAppIDs (AppID TEXT, AppDescrip TEXT, DateAdded NUMERIC, Source TEXT);" 
3. At the next "sqlite3" prompt, I typed (without the quotes):
"CREATE UNIQUE INDEX IF NOT EXISTS AppIdx ON WinAppIDs(AppID, AppDescrip);"
4. Finally, at the "sqlite3" prompt, I typed (without the quotes):
".quit"

There should now be a "WinAppIDs.sqlite" database file in the current directory (eg "/home/sansforensics"). This file contains a "WinAppIDs" table which has AppID, AppDescrip, DateAdded and Source fields (just like the ForensicsWiki page tables).
Additionally, I have hacked/cheated a bit and created a Unique Index called "AppIdx". The purpose of the index is to ensure that we cannot insert duplicate entries. That is, we can't insert an entry with the same AppID AND the same AppDescrip as an existing one. You wouldn't think this would be a problem with the AppID tables, right? Wrong! In the tables there are entries such as:

Different AppIds, Same Descriptions
b8c29862d9f95832    Microsoft Office InfoPath 2010 x86
d64d36b238c843a3    Microsoft Office InfoPath 2010 x86

Same AppIDs, Different Descriptions
23646679aaccfae0    Adobe Reader 9.
23646679aaccfae0    Adobe Reader 9 x64

So rather than calling a SELECT query to check before every INSERT (there are 400+ AppIDs to insert), we'll just make the INSERT fail (via "AppIdx") when there's already an existing entry with the same AppID and AppDescrip. For more on SQLite indexing, read here. Specifically it says, "If the UNIQUE keyword appears between CREATE and INDEX then duplicate index entries are not allowed. Any attempt to insert a duplicate entry will result in an error."
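
To see the index doing its job, here's a minimal stand-alone Perl sketch (run against a freshly created database; the Source URL is made up purely for illustration):

#!/usr/bin/perl -w
use strict;
use DBI;

my $db = DBI->connect("dbi:SQLite:dbname=WinAppIDs.sqlite","","") || die("Unable to connect to database\n");
my $sth = $db->prepare("INSERT INTO WinAppIDs (AppID, AppDescrip, DateAdded, Source) VALUES (?, ?, ?, ?)");
# First INSERT succeeds ...
$sth->execute("23646679aaccfae0", "Adobe Reader 9.", "2011-08-22", "http://example.com");
# ... an identical second INSERT trips the "AppIdx" UNIQUE index - DBI prints a
# warning but the script keeps running (no "die")
$sth->execute("23646679aaccfae0", "Adobe Reader 9.", "2011-08-22", "http://example.com");
$sth->finish;
$db->disconnect;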


Parsing the ForensicsWiki AppID HTML page for data and Populating the Database

Parsing/extracting the AppIDs from HTML is easy once you find out somebody's already written a CPAN package for that exact purpose! It's called "HTML::TableExtract" and you can read all about it here.
Additionally, the folks at ForensicsWiki have also included a "Source" link for each AppID. To extract the link we can use "HTML::LinkExtor". You can read about it here.
As we have seen in previous posts, we can use the Perl "DBI" package to interface with an SQLite Database. So knowing all this, we should now be able to insert our newly extracted data.
To simplify this exercise, we will extract the data and then enter it into the database from one script ("4n6wiki-parser.pl").
Incidentally, the StackOverflow website is a great searchable resource for Perl questions - this was where I not only heard about the previously mentioned packages but also saw example code of their use. Awesome!

Finally(!), we should write a short query script ("4n6wiki-query.pl") so that we can search our SQLite database by either AppID or AppDescrip.

Now let's take a look at what this monkey passes off for code ...

"4n6wiki-parser.pl" Code

#CODE BEGINS ON NEXT LINE
#!/usr/bin/perl -w

# Perl script to parse http://www.forensicswiki.org/wiki/List_of_Jump_List_IDs and import it into an existing SQLite Database

use strict;

use Getopt::Long;
use HTML::TableExtract;
use HTML::LinkExtor;
use DBI;

my $version = "4n6wiki-parser.pl v2012-03-27";
my $help = 0;
my $filename = "";
my $database = "";

GetOptions('help|h' => \$help,
    'file=s' => \$filename,
    'db=s' => \$database);

if ($help || $filename eq "" || $database eq "")
{
    print("\nHelp for $version\n\n");
    print("Perl script to parse http://www.forensicswiki.org/wiki/List_of_Jump_List_IDs and import it into an existing SQLite Database\n");
    print("\nUsage: 4n6wiki-parser.pl [-h|help] [-file filename] [-db database]\n");
    print("-h|help .......... Help (print this information). Does not run anything else.\n");
    print("-file filename ... Copy of HTML file to be parsed.\n");
    print("-db database ..... Target SQLite Database.\n");
    print("\nExample: 4n6wiki-parser.pl -file List_of_Jump_List_IDs.html -db WinAppIDs.sqlite\n");
    exit;
}

print "\nRunning $version\n\n";

open(my $htmlfile, "<".$filename) || die("Unable to open $filename for parsing\n");
my @lines = <$htmlfile>; # extract each line to a list element
my $html_string = join(' ', @lines); # join list elements into one big happy scalar string
close($htmlfile);

my $tblextract = HTML::TableExtract->new(keep_html => 1, headers => [qw(AppID Application Date Source)] );
$tblextract->parse($html_string);

my $lnkextr = HTML::LinkExtor->new();

my @bigarray; # will be all table cells merged into one [row][cell] type array
my $rowcount = 0; # running count of rows in all tables

foreach my $tbl ($tblextract->tables)
{
    my @rows = $tbl->rows;
    my @cols = $tbl->columns;
    print "Extracted Table Size = ".scalar(@rows)." x ".scalar(@cols)."\n";

    for my $rownum (0 .. (@rows - 1))
    {
        # Ass-ume always 4 columns, last one (3) contains link info, col 2 contains date eg 8/22/2011
        for my $colnum (0 .. (@cols - 1))
        {
            if ($colnum == (@cols - 2) )
            {
                #reformat date into form yyyy-mm-dd
                my $date = $tbl->cell($rownum, $colnum);
                $date =~ /(\d+)\/(\d+)\/(\d+)/;
                my $year = $3;
                my $month = $1;
                my $day = $2;
                if ($day < 10)
                {
                    $day = "0".$day;
                }
                if ($month < 10)
                {
                    $month = "0".$month;
                }
                my $newdate = $year."-".$month."-".$day;
                $bigarray[$rowcount][$colnum] = $newdate;
            }
            elsif ($colnum == (@cols - 1))
            {       
                # Extract a link entry eg "http://social.technet.microsoft.com/Forums/" for col 3
                $lnkextr->parse($tbl->cell($rownum, $colnum));
                for my $link_tag ( $lnkextr->links )
                {
                    my ($tag, %links) = @$link_tag;
                    foreach my $key (keys %links)
                    {
                        $bigarray[$rowcount][$colnum] = $links{$key};
                        last; # only do it for one link then bail out
                    }
                }
            }
            else
            {
                #Record each of the other column fields for this row ie col 0, 1, 2
                $bigarray[$rowcount][$colnum] = $tbl->cell($rownum, $colnum);
            }
        }
        $rowcount = $rowcount + 1;       
    }
}

print "Number of Rows Extracted from HTML = $rowcount\n";

my $db = DBI->connect("dbi:SQLite:dbname=$database","","") || die( "Unable to connect to database\n" );
my $sth;

for my $currow (0 .. ($rowcount-1))
{
    my $id = $bigarray[$currow][0];
    my $descrip = $bigarray[$currow][1];
    my $date = $bigarray[$currow][2];
    my $source = $bigarray[$currow][3];
    $sth =  $db->prepare_cached("INSERT INTO WinAppIDs (AppID, AppDescrip, DateAdded, Source) VALUES (?, ?, ?, ?)"); #or die "Couldn't prepare statement: ".$db->errstr;
    $sth->execute($id, $descrip, $date, $source); #or die "Couldn't execute statement: ".$sth->errstr;
}
$sth->finish;
$db->disconnect;
print "$database Update complete ...\n"

# END CODE


"4n6wiki-parser.pl" Code Summary

Before you use this script, ensure that you have installed the following Perl packages: HTML::TableExtract, HTML::LinkExtor, DBI (eg type "sudo cpan HTML::TableExtract" etc.).

The first section calls "GetOptions" to parse the user's input arguments and then the following section is the Help printout.

We then read in the user specified HTML file into one long scalar variable ("$html_string").

Next we create a new HTML::TableExtract object to parse the table data. We use the argument "keep_html => 1" so we can later parse any web links in the table (using HTML::LinkExtor). The other argument, "headers => [qw(AppID Application Date Source)]" tells HTML::TableExtract the column headings so it can figure out how to extract data.
Note: I just used the first word of each column heading and it seemed to work OK. I don't think you can specify something like "Application Description" for a heading because the whitespace will confuse HTML::TableExtract. If they ever change the column headings to use the same first word (eg "AppID" and "AppID Description"), we're in trouble.
Anyhow, next we call the "parse" function of HTML::TableExtract to do all the heavy lifting and extract the data.

Now we create a new HTML::LinkExtor object to extract any HTML links in the table. We then proceed to loop through each row/column of each table that HTML::TableExtract has found (currently, there are 11 tables). We store the result in a huge array of arrays ("bigarray[rownum][colnum]").
For half of the columns (ie "AppId" and "Application Description"), we will just copy the data straight into "bigarray". The other 2 columns ("Date Added" and "Source") require a bit of massaging before entry (heheh).
ForensicsWiki lists the dates in the U.S. format of month/day/year (I DARE YOU to say it's a better format! LOL) with no preceding zeros for single digits. SQLite uses the yyyy-mm-dd format in its time calculations. So, just in case we want to sort later by date, our script has to re-arrange the date before storing it in "bigarray". We use a regular expression to extract the various parts of the date and then zero pad them if required before re-assembling them into the SQLite format. The regular expression code is:

"$date =~ /(\d+)\/(\d+)\/(\d+)/;"

In the line before this, "$date" was assigned the value from the date cell (eg "8/22/2011"). Each "\d+" matches a field of one or more digits. The surrounding "( )"s indicate we are extracting the various values (in order) to $1, $2, $3. We also need to "\" escape the date separators "/" so they don't confuse the regular expression delimiters. More information on Perl regular expressions is available here.
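
To illustrate, here's a self-contained snippet (with a hard-coded sample date) that does the same extraction, using "sprintf" for the zero-padding instead of the two "if" statements:

#!/usr/bin/perl -w
use strict;

my $date = "8/22/2011"; # hard-coded sample cell value
if ($date =~ /(\d+)\/(\d+)\/(\d+)/)
{
    # $1 = month, $2 = day, $3 = year
    my $newdate = sprintf("%04d-%02d-%02d", $3, $1, $2);
    print "$newdate\n"; # prints "2011-08-22"
}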

I found the basic code for the Link Extraction here. However, we call the HTML::LinkExtor's "parse" function with the table cell content (not the whole row). I found it quite difficult to initially understand this snippet, so I will try to explain using our actual code below.

$lnkextr->parse($tbl->cell($rownum, $colnum));
for my $link_tag ( $lnkextr->links )
{
    my ($tag, %links) = @$link_tag;
    foreach my $key (keys %links)
    {
        $bigarray[$rowcount][$colnum] = $links{$key};
        last; # only do it for one link then bail out
    }
}


To get the results of the HTML::LinkExtor's "parse" call, we have to call the "$lnkextr->links" method. According to the CPAN documentation, this returns data in the form:
"[$tag, $attr => $url1, $attr2 => $url2,...]".
The enclosing "[ ]"s denote an array containing the "$tag" scalar variable and what looks like a hash list ("$attr => $url1, $attr =>$url2").
Theoretically, there should only be one "$link_tag" result for our case and therefore only one iteration of the "for my $link_tag" loop.

In the loop, we now see: "my ($tag, %links) = @$link_tag;"
The right-hand side, "@$link_tag", is saying treat this "$link_tag" scalar as a reference to an array (hence the "@"). This array will be the one containing the "$tag" and the hash list mentioned previously.
The left-hand side assigns values for the various parts of the array. So we end up with "$tag" and a hash list array (now called "%links").
The next step is to iterate through "%links" and for the first key/URL it finds, store that link in "bigarray".
For more information on Perl References see here.
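
If that's still a bit murky, this stand-alone snippet (with a made-up link array, shaped like the ones "links" returns) mimics the dereferencing:

#!/usr/bin/perl -w
use strict;

# a made-up example of what one element of $lnkextr->links looks like
my $link_tag = [ "a", "href" => "http://example.com/some-link" ];
my ($tag, %links) = @$link_tag; # dereference into the tag + an attribute/URL hash
foreach my $key (keys %links)
{
    print "$tag : $links{$key}\n"; # prints "a : http://example.com/some-link"
    last; # only do it for one link then bail out
}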

OK, now we should have all the data we need in "bigarray" and we also have an SQLite database file ("WinAppIDs.sqlite") containing an empty table ("WinAppIDs"). Time to feed the machine!
As we've seen in previous posts, we first call DBI's "connect" to interface with the Database. If it doesn't work we quit and nothing gets inserted.
Next we loop through each "row" of "bigarray" and store the column values in temporary scalars ($id, $descrip, $date, $source). We then prepare our SQLite INSERT statement:

$sth =  $db->prepare_cached("INSERT INTO WinAppIDs (AppID, AppDescrip, DateAdded, Source) VALUES (?, ?, ?, ?)"); #or die "Couldn't prepare statement: ".$db->errstr;

We're using "prepare_cached" because it apparently minimises resources for our repeated INSERTs (seriously?!) compared to a normal "prepare".
The "?" arguments are placeholders for our temporary values which get assigned in the next line.

$sth->execute($id, $descrip, $date, $source); #or die "Couldn't execute statement: ".$sth->errstr;

Note: I've commented out the ("die") error handling parts for "prepare_cached" and "execute" because if we fail to INSERT, it could be due to a duplicate. We should keep looping through "bigarray" because there might be new non-duplicate information later in "bigarray" (eg we might be updating an existing database).
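
Incidentally, because "prepare_cached" returns the same statement handle on each loop iteration, you'd get much the same effect by calling a plain "prepare" once before the loop. A rough equivalent sketch (using the script's own variables):

my $sth = $db->prepare("INSERT INTO WinAppIDs (AppID, AppDescrip, DateAdded, Source) VALUES (?, ?, ?, ?)");
for my $currow (0 .. ($rowcount-1))
{
    # a failed execute (eg a duplicate entry) just prints a warning and the loop continues
    $sth->execute($bigarray[$currow][0], $bigarray[$currow][1],
        $bigarray[$currow][2], $bigarray[$currow][3]);
}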

After we've finished looping through "bigarray", we call "finish" and "disconnect" and print out an "Update complete" message.


"4n6wiki-query.pl" Code

# CODE BEGINS ON NEXT LINE
#!/usr/bin/perl -w

# Perl script to query an existing SQLite Database for Windows AppID info
# Intended to be used in conjunction with the 4n6wiki-parser.pl script.

use strict;

use Getopt::Long;
use DBI;

my $version = "4n6wiki-query.pl v2012-03-27";
my $help = 0;
my $appid = "";
my $descrip = "";
my $database = "";

GetOptions('help|h' => \$help,
    'appid=s' => \$appid,
    'descrip=s' => \$descrip,
    'db=s' => \$database);

if ($help || $database eq "" || ($descrip eq "" && $appid eq "") || ($descrip ne "" && $appid ne "") )
{
    print("\nHelp for $version\n\n");
    print("Perl script to query an existing SQLite Database for Windows AppID info\n");
    print("Intended to be used in conjunction with the 4n6wiki-parser.pl script.\n"); 
    print("\nUsage: 4n6wiki-query.pl [-h|help] [-appid APPID] [-descrip Description] [-db database]\n");
    print("-h|help .......... Help (print this information). Does not run anything else.\n");
    print("-appid APPID ..... Search Database for match(es) for this APPID.\n");
    print("-descrip Description ... Search Database for match(es) \"like\" this Description.\n");
    print("-db database ..... Target SQLite Database.\n");
    print("\nExamples: \n");
    print("4n6wiki-parser.pl -appid 3dc02b55e44d6697 -db WinAppIDs.sqlite\n");
    print("4n6wiki-parser.pl -descrip \"Adobe Flash\" -db WinAppIDs.sqlite\n");
    print("Note: Listing BOTH -appid and -descrip will not work / prints this message\n\n");
    exit;
}

print "\nRunning $version\n\n";

my $db = DBI->connect("dbi:SQLite:dbname=$database","","") || die( "Unable to connect to database\n" );
my $sth;
if ($appid ne "")
{
    $sth =  $db->prepare("SELECT AppID, AppDescrip FROM WinAppIDs WHERE AppID=?") or die "Couldn't prepare statement: ".$db->errstr;
    $sth->execute($appid) or die "Couldn't execute statement: ".$sth->errstr;
    PrintHeadings($sth);
    PrintResults($sth);
    if ($sth->rows == 0)
    {
        print "No Matching AppIDs found!\n";
    }
    else
    {   
        print $sth->rows." Matches returned\n";
    }
}
elsif ($descrip ne "")
{
    my $likestr = "%".$descrip."%";
    $sth =  $db->prepare("SELECT AppID, AppDescrip FROM WinAppIDs WHERE AppDescrip LIKE ?") or die "Couldn't prepare statement: ".$db->errstr;
    $sth->execute($likestr) or die "Couldn't execute statement: ".$sth->errstr;
    PrintHeadings($sth);
    PrintResults($sth);
    if ($sth->rows == 0)
    {
        print "No Matching Descriptions found!\n";
    }
    else
    {   
        print $sth->rows." Matches returned\n";
    }
}

$sth->finish;
$db->disconnect;

# End Main

sub PrintHeadings
{
    my $sth = shift;

    # Print field headings
    for (my $i = 0; $i <= $sth->{NUM_OF_FIELDS}-1; $i++)
    {
        if ($i == $sth->{NUM_OF_FIELDS} - 1)
        {
            print $sth->{NAME}->[$i]."\n"; #last item adds a newline char
        }
        else
        {   
            print $sth->{NAME}->[$i]." | ";
        }
    }
}

sub PrintResults
{
    my $sth = shift;
    my @rowarray;

    # Prints row by row / field by field
    while (@rowarray = $sth->fetchrow_array() )
    {
        for (my $i = 0; $i <= $sth->{NUM_OF_FIELDS}-1; $i++)
        {
            if ($i == $sth->{NUM_OF_FIELDS} - 1 )
            {
                print $rowarray[$i]."\n"; #last field in row adds newline
            }
            else
            {
                if ($rowarray[$i])
                {
                    print $rowarray[$i]." | ";
                }
                else
                {
                    print " | "; # field returned could be UNDEFINED, just print separator
                }
            }
        }
    }
}

# CODE ENDS


"4n6wiki-query.pl" Code Summary

This script is very similar to our previous "ffparser.pl" script. It will use DBI to run SELECT queries against a user nominated SQLite Database. There will be 2 types of queries - a search by AppID and a search for LIKE terms in AppDescrip. We won't use an exact match for AppDescrip because the analyst might not know the exact description (eg they just know it's for CCleaner).
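
For example, a "-descrip CCleaner" argument ends up running something like this (sketch only, assuming the "$db" connection from the script):

my $likestr = "%"."CCleaner"."%"; # ie '%CCleaner%'
my $sth = $db->prepare("SELECT AppID, AppDescrip FROM WinAppIDs WHERE AppDescrip LIKE ?");
$sth->execute($likestr); # matches any AppDescrip containing "CCleaner"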
As usual, it starts off with "GetOptions" and the Help printout sections.
We then call DBI's "connect" and one of the versions of "prepare" (depending on the user's input arguments).
Next we call "PrintHeadings" and "PrintResults" (we cut and pasted these functions from "ffparser.pl").
Finally, we call "finish" and "disconnect".

Testing

To test our scripts we will save a copy of the current ForensicsWiki AppID page as "List_of_Jump_List_IDs.html". We will then edit/remove the entry for "Win32 cmd.exe" in the last row of the last table and re-save it as "List_of_Jump_List_IDs-withoutCMD.html".
Now we can simulate an update by first running "4n6wiki-parser.pl" with "List_of_Jump_List_IDs-withoutCMD.html" and then again with "List_of_Jump_List_IDs.html". After our first run, our SQLite Database (created previously) should not contain our "Win32 cmd.exe" entry and then after the second run, it should contain our "Win32 cmd.exe" entry.

OK, here we run "4n6wiki-parser.pl" with "List_of_Jump_List_IDs-withoutCMD.html". (ie no "Win32 cmd.exe" entry)

sansforensics@SIFT-Workstation:~$ ./4n6wiki-parser.pl -file List_of_Jump_List_IDs-withoutCMD.html -db WinAppIDs.sqlite


Running 4n6wiki-parser.pl v2012-03-27


Extracted Table Size = 53 x 4
Extracted Table Size = 31 x 4
Extracted Table Size = 10 x 4
Extracted Table Size = 87 x 4
Extracted Table Size = 50 x 4
Extracted Table Size = 16 x 4
Extracted Table Size = 66 x 4
Extracted Table Size = 30 x 4
Extracted Table Size = 2 x 4
Extracted Table Size = 55 x 4
Extracted Table Size = 7 x 4
Number of Rows Extracted from HTML = 407
WinAppIDs.sqlite Update complete ...
sansforensics@SIFT-Workstation:~$


OK now we run a search for the "cmd" term in our "WinAppIDs.sqlite" database.

sansforensics@SIFT-Workstation:~$ ./4n6wiki-query.pl -descrip "cmd" -db WinAppIDs.sqlite


Running 4n6wiki-query.pl v2012-03-27


AppID | AppDescrip
6728dd69a3088f97 | Windows Command Processor - cmd.exe (64-bit)
1 Matches returned
sansforensics@SIFT-Workstation:~$

As expected, it only returns the value for the 64-bit cmd.exe.
Now let's run "4n6wiki-parser.pl" with "List_of_Jump_List_IDs.html" (ie includes "Win32 cmd.exe" entry).

sansforensics@SIFT-Workstation:~$ ./4n6wiki-parser.pl -file List_of_Jump_List_IDs.html -db WinAppIDs.sqlite


Running 4n6wiki-parser.pl v2012-03-27


Extracted Table Size = 53 x 4
Extracted Table Size = 31 x 4
Extracted Table Size = 10 x 4
Extracted Table Size = 87 x 4
Extracted Table Size = 50 x 4
Extracted Table Size = 16 x 4
Extracted Table Size = 66 x 4
Extracted Table Size = 30 x 4
Extracted Table Size = 2 x 4
Extracted Table Size = 55 x 4
Extracted Table Size = 8 x 4
Number of Rows Extracted from HTML = 408
DBD::SQLite::st execute failed: columns AppID, AppDescrip are not unique at ./4n6wiki-parser.pl line 114.
DBD::SQLite::st execute failed: columns AppID, AppDescrip are not unique at ./4n6wiki-parser.pl line 114.
DBD::SQLite::st execute failed: columns AppID, AppDescrip are not unique at ./4n6wiki-parser.pl line 114.
[... Lines Edited Out]
DBD::SQLite::st execute failed: columns AppID, AppDescrip are not unique at ./4n6wiki-parser.pl line 114.
WinAppIDs.sqlite Update complete ...
sansforensics@SIFT-Workstation:~$

OK, so we are getting some errors about duplicate entries, but the last table seems to have an extra row (8 x 4, not 7 x 4). This looks promising!
We'll run a query and check ...

sansforensics@SIFT-Workstation:~$ ./4n6wiki-query.pl -descrip "cmd" -db WinAppIDs.sqlite


Running 4n6wiki-query.pl v2012-03-27


AppID | AppDescrip
6728dd69a3088f97 | Windows Command Processor - cmd.exe (64-bit)
bc0c37e84e063727 | Windows Command Processor - cmd.exe (32-bit)
2 Matches returned
sansforensics@SIFT-Workstation:~$


YAY! The update worked!

Let's run a quick query using the bc0c37e84e063727 AppId (for "Win32 cmd.exe") and see what results we get:

sansforensics@SIFT-Workstation:~$ ./4n6wiki-query.pl -appid bc0c37e84e063727 -db WinAppIDs.sqlite

Running 4n6wiki-query.pl v2012-03-27

AppID | AppDescrip
bc0c37e84e063727 | Windows Command Processor - cmd.exe (32-bit)
1 Matches returned
sansforensics@SIFT-Workstation:~$


OK we found our latest entry.

Just for fun, let's check how many entries we have via "sqlite3".

sansforensics@SIFT-Workstation:~$ sqlite3 WinAppIDs.sqlite
SQLite version 3.6.16
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite>
sqlite> select count (AppId) from WinAppIDs;
408
sqlite>

Now let's see if that corresponds with the SQLite Manager Firefox plugin.

Opening our Database with the "SQLite Manager" Firefox Plugin

Note(1): The HTML links have been extracted into the table Source column.
Note(2): There are 408 records reported ie the same as what our previous "sqlite3" query reported. The latest record is our "Win32 cmd.exe" record.

Our last check (with "sqlite3") is to see if our database's DateAdded field is usable by SQLite (we'll only grab the first 3 records). So here are those records with our inputted date format:

sqlite> select AppId, AppDescrip, DateAdded from WinAppIDs limit 3;
65009083bfa6a094|(app launched via XPMode)|2011-08-22
469e4a7982cea4d4|? (.job)|2011-08-22
b0459de4674aab56|(.vmcx)|2011-08-22
sqlite>

And here's the same results with the date field transformed into "number of days since now":

sqlite> select AppId, AppDescrip, julianday('now') - julianday(DateAdded) from WinAppIDs limit 3;
65009083bfa6a094|(app launched via XPMode)|220.419525995385
469e4a7982cea4d4|? (.job)|220.419525995385
b0459de4674aab56|(.vmcx)|220.419525995385
sqlite>

So we can see that the period from 22 August 2011 until now (29 March 2012) has been calculated as ~220 days.
A manual check: 9 remaining days of August + 30 Sept days + 31 Oct days + 30 Nov days + 31 Dec days + 31 Jan days + 29 Feb days + 29 Mar days = 220
Coolio! Our date field is usable by SQLite.

Summary

OK so everything seems to work. We have extracted various fields from the ForensicsWiki AppId web page and then entered that data into an SQLite Database. We have then successfully queried that database for some information.
The code from "4n6wiki-query.pl" would be easy to cut and paste into existing scripts that currently only output the AppID thus possibly saving analysts some time.
I'm not sure if this will be of use to any forensicators but it was interesting from a Perl programming perspective. If you think you might find it useful, please leave a comment.

Thursday 22 March 2012

Perl Parsing an SQLite Database File Header


Previously on Cheeky4n6Monkey ... we used the Perl DBI package to retrieve the contents of Firefox's (v3.5.17 and v11.0) "places.sqlite" and "downloads.sqlite". A secondary discovery was that depending on the version of the sqlite3 command line exe, one might be able to use it to read Firefox 11 ".sqlite" files ... or not. Just to clarify, sqlite3 v3.6.16 cannot read Firefox 11 files but sqlite3 (v3.7.10) can.
Anyway, rather than waste time trying different versions of sqlite3 on various ".sqlite" files, I thought a Perl script to parse the ".sqlite" file header might be handy.
Even if it's not particularly unique/useful (there's already an SQLite Manager Firefox plugin), at least we get more practice with Perl. Also, as far as I know, SQLite Manager does not show the SQLite Version nor does it show a meaningful Schema Format string.

The relevant SQLite file structure is detailed here. Rather than extract all of the fields, I thought I would initially concentrate on 6 fields:
- SQLite Version eg 3007007
- SQLite Schema Format eg SQLite v3.0.0
- Page Size (in bytes)
- Number of Pages
- Number of Pages on Freelist
- Text Encoding used eg UTF-8

The last 4 fields can be verified/viewed using SQLite Manager and the first 2 fields can be verified/viewed using the SIFT's Bless Hex Editor.

So let's get stuck into the coding!


Code

# CODE BEGINS AFTER THIS LINE
#!/usr/bin/perl -w

# Perl script to parse selected SQLite Database header fields
# Based on the Database Header section of http://www.sqlite.org/fileformat2.html

use strict;

use Getopt::Long;
use Encode;

my $version = "sqlite-parser.pl v2012-03-21";
my $help = 0;
my $filename = "";

GetOptions('help|h' => \$help,
    'file=s' => \$filename);

if ($help || $filename eq "")
{
    print("\nHelp for $version\n\n");
    print("Perl script to parse selected SQLite header fields\n");
    print("\nUsage: sqlite-parser.pl [-h|help] [-file filename]\n");
    print("-h|help .......... Help (print this information). Does not run anything else.\n");
    print("-file filename ... sqlite filename to be parsed.\n");
    print("\nExample: sqlite-parser.pl -file /cases/firefox/places.sqlite\n");
    exit;
}

print "\nRunning $version\n\n";

# Try read-only opening the SQLite file to extract various header information
my $rawsqliteheader;

open(my $sqlitefile, "<".$filename) || die("Unable to open $filename for header parsing\n");
#binmode($sqlitefile);
sysread ($sqlitefile, $rawsqliteheader, 100) || die("Unable to read $filename for header parsing\n");
close($sqlitefile);

# First check that we have a valid header 1st 16 bytes should read "SQLite format 3\000" in UTF8
# or alternatively "53 51 4c 69 74 65 20 66 6f 72 6d 61 74 20 33 00" in hex
my $rawsqlitestring = substr($rawsqliteheader, 0, 16);
my $sqlitestring = decode("UTF8", $rawsqlitestring);
if ($sqlitestring eq "SQLite format 3\000")
{
    print "SQLite String Parsed OK - Continuing Processing of $filename ...\n";
}
else
{
    print "$filename does NOT have a Valid SQLite String in Header! Bailing out ...\n\n";
    exit;
}


# Extract the database page size in bytes
# Should/Must be a power of two between 512 and 32768 inclusive, or the value 1 representing a page size of 65536
my $rawdbpagesize = substr($rawsqliteheader, 16, 2);
my $dbpagesize = unpack("n", $rawdbpagesize); # Use "n" for 2 byte int
if ($dbpagesize == 1)
{
    print("Database Page Size (bytes) = 65536\n");
}
else
{
    print("Database Page Size (bytes) = $dbpagesize\n");
}


# Extract the size of the database file in pages.
my $rawnumpages = substr($rawsqliteheader, 28, 4);
my $numpages = unpack("N", $rawnumpages); # use "N" for 4 byte int
if ($numpages != 0)
{
    # Must check that changecounter = validversionfor
    # as validversionfor stores the current changecounter value after the SQLite version number was written
    # (eg at creation)
    my $rawchangecounter = substr($rawsqliteheader, 24, 4);
    my $changecounter = unpack("N", $rawchangecounter);

    my $rawvalidversionfor = substr($rawsqliteheader, 92, 4);
    my $validversionfor = unpack("N", $rawvalidversionfor);

#    print "changecounter = $changecounter\n";
#    print "validversionfor = $validversionfor\n";

    if ($changecounter == $validversionfor)
    {
        print("Valid Number of Pages = $numpages\n");
    }
    else
    {
        print("Invalid Number of Pages! (mismatched changecounter value)\n");
    }
}
else
{
    print("Invalid Number of Pages! (zero)\n");
}


# Extract the total number of freelist pages.
my $rawnumfreelistpages = substr($rawsqliteheader, 36, 4);
my $numfreelistpages = unpack("N", $rawnumfreelistpages); # use "N" for 4 byte int
print("Total Number of Freelist Pages = $numfreelistpages\n");


# Extract the schema format number. Supported schema formats are 1, 2, 3, and 4.
my $rawschemaformatnum = substr($rawsqliteheader, 44, 4);
my $schemaformatnum = unpack("N", $rawschemaformatnum); # use "N" for 4 byte int
#print("Schema Format Number = $schemaformatnum\n");
if ($schemaformatnum == 1)
{
    print("Schema Format = SQLite v3.0.0\n");
}
elsif ($schemaformatnum == 2)
{
    print("Schema Format = SQLite v3.1.3 (2005)\n");
}
elsif ($schemaformatnum == 3)
{
    print("Schema Format = SQLite v3.1.4 (2005)\n");
}
elsif ($schemaformatnum == 4)
{
    print("Schema Format = SQLite v3.3.0 (2006) or higher\n");
}
else
{
    print("Invalid Schema Format!\n");
}


# Extract the database text encoding. A value of 1 means UTF-8. A value of 2 means UTF-16le. A value of 3 means UTF-16be.
my $rawtextencode = substr($rawsqliteheader, 56, 4);
my $textencode = unpack("N", $rawtextencode); # use "N" for 4 byte int

#print("Text Encoding = $textencode\n");
if ($textencode == 1)
{
    print("Text Encoding = UTF-8\n");
}
elsif ($textencode == 2)
{
    print("Text Encoding = UTF-16le\n");
}
elsif ($textencode == 3)
{
    print("Text Encoding = UTF-16be\n");
}
else
{
    print("Invalid Text Encoding!\n");
}


# Extract the SQLite Version number as a 4 byte Big Endian Integer at bytes 96-100
# The version number will be in the form (X*1000000 + Y*1000 + Z)
# where X is the major version number (3 for SQLite3), Y is the minor version number and Z is the release number
# eg 3007004 for 3.7.4
my $rawsqliteversion = substr($rawsqliteheader, 96, 4);
my $sqlversion = unpack("N", $rawsqliteversion);
print("SQLite Version is: $sqlversion\n\n");

# CODE ENDS HERE


Code Summary

As ".sqlite" files contain a UTF-8 encoded header string we need to use the "decode" function from the Encode Perl package. So the first thing to note is the "use Encode" statement.
Next, we have the familiar GetOptions and Help sections. Yawn!
And now the journey really begins ...
We call a read-only "open" on the user specified filename and then "sysread" in the first 100 bytes into the "$rawsqliteheader" variable.
Now that we have a copy of what should be the file header, we can now "close" the user specified file.

The first thing we do is look for the UTF-8 encoded "SQLite format 3\000" string. We use "substr" to copy the first 16 bytes into "$rawsqlitestring" from our 100 byte "$rawsqliteheader" buffer.
Next we call "decode" to convert "$rawsqlitestring" into a Perl usable string format and store the result in "$sqlitestring". See here for more information on encoding/decoding with Perl.
Finally, we test the value of "$sqlitestring" and "exit" if it does not match "SQLite format 3\000".
Pretty straight forward eh?

Similarly, for the Database Page Size field - we call "substr" to copy our bytes (in this case only 2) but instead of calling "decode" we use "unpack" for interpreting numbers. There's a bit of a hack going on with this particular field - if its value is 1, the actual page size used is supposed to be 65536 bytes. So we include an "if" statement to handle that.
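
As a quick illustration of "unpack" with some hand-crafted byte strings (not real header reads):

# 2 bytes 0x10 0x00 as a Big Endian 16 bit int ("n")
print unpack("n", "\x10\x00")."\n"; # prints 4096
# 4 bytes 0x00 0x2D 0xE2 0x1F as a Big Endian 32 bit int ("N")
print unpack("N", "\x00\x2D\xE2\x1F")."\n"; # prints 3007007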

I could go on but the other parsing sections do pretty much the same thing (except they call "substr" to extract 4 bytes from various offsets).
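
And if you wanted the more familiar dotted version string, the (X*1000000 + Y*1000 + Z) layout means a little integer arithmetic would do it - a possible addition (not in the script above, just a suggestion):

my $sqlversion = 3007007; # as read from bytes 96-100
my $major = int($sqlversion / 1000000);
my $minor = int(($sqlversion % 1000000) / 1000);
my $release = $sqlversion % 1000;
print "SQLite Version is: $major.$minor.$release\n"; # prints "SQLite Version is: 3.7.7"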


Testing

Here's the output of our "sqlite-parse.pl" script with a bogus fabricated file we prepared for an earlier post:

sansforensics@SIFT-Workstation:~$ sqlite-parser.pl -file /cases/cheeky-file.c4n6

Running sqlite-parser.pl v2012-03-21

/cases/cheeky-file.c4n6 does NOT have a Valid SQLite String in Header! Bailing out ...

sansforensics@SIFT-Workstation:~$


OK, so that shows that ours is a discerning script ... we don't accept any old riff-raff claiming to be SQLite!
Now let's try it with a Firefox 11.0 "places.sqlite" file:

sansforensics@SIFT-Workstation:~$ sqlite-parser.pl -file /cases/firefox11/places.sqlite

Running sqlite-parser.pl v2012-03-21

SQLite String Parsed OK - Continuing Processing of /cases/firefox11/places.sqlite ...
Database Page Size (bytes) = 4096
Valid Number of Pages = 145
Total Number of Freelist Pages = 5
Schema Format = SQLite v3.1.3 (2005)
Text Encoding = UTF-8
SQLite Version is: 3007007

sansforensics@SIFT-Workstation:~$


That looks OK but we should validate our results using the SIFT's SQLite Manager Firefox plugin.

"SQLite Manager" Firefox Plugin Validating Our Script Output

As you can see, the SQLite Manager confirms our script's "Database Page Size" (4096), "Valid Number of Pages" (145), "Total Number of Freelist Pages" (5) and "Text Encoding" (UTF-8) values.


Now we will use the SIFT's Bless Hex Editor to validate our script's results for the "SQLite Version" and "Schema Format". You can access Bless from the Applications ... Programming ... Bless Hex Editor menu.

Bless Hex Editor Validating Our Script's outputted SQLite Version (3007007)

Bless Hex Editor Validating Our Script's outputted Schema Format (2)

OK, the Bless screenshots confirm our script's "SQLite Version" (3007007) and "Schema Format" (2, which corresponds to "SQLite v3.1.3"). Dare I say, we've been blessed? heheh

For extra giggity-giggity-giggles, let's now use our script with a Firefox 3.5.17 "places.sqlite" file:

sansforensics@SIFT-Workstation:~$ sqlite-parser.pl -file ffexamples/places.sqlite


Running sqlite-parser.pl v2012-03-21


SQLite String Parsed OK - Continuing Processing of ffexamples/places.sqlite ...
Database Page Size (bytes) = 4096
Valid Number of Pages = 203
Total Number of Freelist Pages = 0
Schema Format = SQLite v3.0.0
Text Encoding = UTF-8
SQLite Version is: 3007004


sansforensics@SIFT-Workstation:~$

We can see that Firefox 3.5.17 and Firefox 11.0 use different schemas and SQLite versions for their respective "places.sqlite" files. For Firefox 11.0, Schema Format = SQLite v3.1.3 (2005), SQLite Version is: 3007007. For Firefox 3.5.17, Schema Format = SQLite v3.0.0, SQLite Version is: 3007004.
This information would have been handy before we started trying to open Firefox 11.0 database files with a superseded copy of sqlite3 eh?

Anyway, so ends another post. Please let me know if you think/find it useful or if you would like any more header fields printed. Cheers!

Wednesday 21 March 2012

Using Perl to View an SQLite Database

 
Warning: This is one of my longer, rambling posts and there aren't many pics either.

According to the SQLite website, SQLite is an open source, cross-platform database library that is used in software applications (eg DropBox and Skype clients), gadgets (eg Nokia phones, MP3 players, PDAs) and website databases. As SQLite becomes more prevalent, it seems prudent for forensicators to be able to retrieve/view the contents of these databases. Thus, Enter the Perl Monkey!
Special mention/unashamed name drop must go to Detective Cindy Murphy who mentioned the increasing prevalence of SQLite in the mobile phone arena which in turn, inspired this post.

In brief, SQLite files are divided into equal sized pages. The minimum file size is one page (the minimum page size is 512 bytes), the maximum file size is ~140 TB. While page types can vary, each page size within a single database file is constant. Page sizes can/will differ between different database files. For those interested, the SQLite file format is defined here. 
Richard Drinkwater's blog has some excellent information on carving SQLite databases. Andrew Hoog has also released a set of carving slides here.

Firefox and SQLite

One of the more popular applications to use SQLite is Firefox. Firefox 3.6.17 is installed on SIFT V2.12 - which will probably be the last of the 3.6 branch. Firefox 11 is the latest "main" branch release. Both versions feature "Private Browsing" - the ability to automatically clear a User's History, Downloads and Cookies.
In 2011, Alex blogged about Firefox 4 forensics here and Tsurany also wrote about Firefox's Private Browsing Mode here.
For more information on Firefox and SQLite, I highly recommend "Digital Forensics With Open Source Tools" by Altheide and Carvey (p 147 onwards).

In 2009, Kristinn Gudjonsson wrote an awesome Firefox History Perl script which he described in further detail here.

To improve my overall understanding (of both Perl and SQLite), I have written a stripped down version of Kristinn's script. It's kinda re-inventing the wheel but I think I will improve my understanding/retention by doing rather than just reading. And given the stripped down nature of my script, it will also take less time for me to explain.

But before monkeying around with the code, we need to find out about the relevant Firefox database schema (eg table names, keys, fields). Here is the schema for "places.sqlite" as defined by ForensicsWiki and also via my own observations:

Firefox's "places.sqlite"schema


Note: One side effect of having "moz_bookmarks" linked with "moz_places" is that "places.sqlite" will always contain any bookmarked site information regardless of CCleaner use or "Private Browsing".

Here is the schema for "downloads.sqlite":

Firefox's "downloads.sqlite" schema

According to ForensicsWiki, Firefox stores its SQLite files in the following places:
- Windows XP: C:\Documents and Settings\<username>\Application Data\Mozilla\Firefox\Profiles\<profile folder>\places.sqlite
- Windows Vista: C:\Users\<user>\AppData\Roaming\Mozilla\Firefox\Profiles\<profile folder>\places.sqlite
- Linux: /home/<user>/.mozilla/firefox/<profile folder>/places.sqlite
- MacOS: /Users/<user>/Library/Application Support/Firefox/Profiles/default.lov/places.sqlite

While each of these ".sqlite" files contains an SQLite database, some of them (eg "places.sqlite") contain multiple tables (eg moz_places, moz_historyvisits, moz_bookmarks). When running, Firefox also creates temporary ".sqlite" files with the extensions ".sqlite-shm" for shared memory (eg "places.sqlite-shm") and ".sqlite-wal" for write ahead logs (eg "places.sqlite-wal"). As you can observe these at run time with Windows Explorer, there's a good chance they are carvable/recoverable.

Firefox 11 has the following files: addons.sqlite, chromeappsstore.sqlite, content-prefs.sqlite, cookies.sqlite, downloads.sqlite, extensions.sqlite, formhistory.sqlite, permissions.sqlite, places.sqlite, search.sqlite, signons.sqlite, webappsstore.sqlite.

Firefox 3.6.17 has the following sqlite files:
content-prefs.sqlite, cookies.sqlite, downloads.sqlite, formhistory.sqlite, permissions.sqlite, places.sqlite, search.sqlite, signons.sqlite, webappsstore.sqlite, urlclassifier3.sqlite.

Note: The bold entries denote the differing files between Firefox versions. It's also easy to miss files when switching between browsing modes / using CCleaner, so take this list with a grain of salt.

For our script, we will be printing the Downloads, Bookmarks, and History information only. Unlike Kristinn's script, we won't offer HTML output or detect time-skewing.
As we saw in the schema diagram above, the Bookmarks and History information is stored in the "places.sqlite" file. This file contains the "moz_places", "moz_bookmarks" and "moz_historyvisits" tables.
The Downloads information is stored in the "moz_downloads" table from the "downloads.sqlite" file.

Viewing SQLite Tables

There are several SQLite Database browsers around but not all of them will be able to open the later releases of SQLite (the latest is 3.7.10).
The easiest one to use is probably the Firefox Add-on GUI called SQLite Manager. I have used it to view the contents of Firefox 11 (uses SQLite V3.7.7) ".sqlite" files. The SQLite Manager installed on SIFT appears to use SQLite V3.7.4.
On SIFT, there is also an sqlite3 command line exe installed (in "/usr/local/bin/sqlite3"). Unfortunately, it seems that the current installed version (V3.6.16) cannot read Firefox 11 ".sqlite" files. It complains that the file is encrypted or not a database file. It will read SIFT's own Firefox 3.5.17 ".sqlite" files however.
Consequently, I have downloaded the latest sqlite3 exe (V3.7.10) from the official SQLite website and then used it to successfully read Firefox 11 ".sqlite" files. For more information on using the sqlite3 command line interface I highly recommend "Digital Forensics With Open Source Tools" (Altheide and Carvey) p.150.

For Perl scripts, we need to ensure that our SIFT's DBI (Database Interface?) package is up to date by typing: "sudo cpan DBI" (Warning: it takes a while to update!).
Incidentally, there are other Perl CPAN packages available (ie DBD::SQLite and SQLite::DB) which are supposed to make it easier to interact with SQLite databases (eg make it more Object Oriented) but whilst I found they were able to read Firefox 3.5.17 ".sqlite" files, they were not able to read Firefox 11 ".sqlite" files.
So that's when I decided to re-write my initial script using both Kristinn's script and this article by Mark-Jason Dominus as guides.

Code

OK so here's my script code:

# CODE BEGINS AFTER THIS LINE
#!/usr/bin/perl -w

# Perl script to parse Firefox places.sqlite and downloads.sqlite
# Based on Kristinn Gudjonsson's "ff3histview" Perl script (http://blog.kiddaland.net/dw/ff3histview) and
# Mark-Jason Dominus's "A Short Guide to DBI" article (http://www.perl.com/pub/1999/10/DBI.html)
# Works with SIFT's Firefox V3.6.17 and WinXP's Firefox V11.0
# Be sure to run "sudo cpan DBI" to update the DBI Perl package before running!

use strict;

use Getopt::Long;
use DBI;

my $version = "ffparser.pl v2012-03-19";
my $help = 0;
my $bk = 0;
my $dload = 0;
my $hist = 0;
my $path = "";

GetOptions('help|h' => \$help,
    'bk' => \$bk,
    'dload' => \$dload,
    'hist' => \$hist,
    'path=s' => \$path);

if ($help || $path eq "" || (!$bk and !$dload and !$hist))
{
    print("\nHelp for $version\n\n");
    print("Perl script to parse Firefox places.sqlite and downloads.sqlite\n");
    print("\nUsage: ffparser.pl [-h|help] [-path pathname] [-bk] [-dload] [-hist]\n");
    print("-h|help .......... Help (print this information). Does not run anything else.\n");
    print("-path pathname ... Path to folder containing places.sqlite and downloads.sqlite.\n");
    print("-bk .............. Parse for Bookmarks (Date Added, Title, URL, Count).\n");
    print("-dload ........... Parse for Downloaded items (Download Ended, Source, Target, Current No. Bytes).\n");
    print("-hist ............ Parse for History (Date Visited, Title, URL, Count).\n");
    print("\nExample: ffparser.pl -path /cases/firefox/ -bk -dload -hist");
    print("\nNote: Trailing / at end of path\n");
    exit;
}

# For now, ass-ume downloads.sqlite, places.sqlite are in the path provided
# Also, ass-ume that the path has a trailing "/" eg TAB autocompletion used
print "Running $version\n";

# Try read-only opening "places.sqlite" to extract the Big Endian 4 byte SQLite Version number at bytes 96-100
# The version number will be in the form (X*1000000 + Y*1000 + Z)
# where X is the major version number (3 for SQLite3), Y is the minor version number and Z is the release number
# eg 3007004 for 3.7.4
my $placesver=0;
open(my $placesfile, "<".$path."places.sqlite") || die("Unable to open places.sqlite for version retrieval\n");
#binmode($placesfile);
seek ($placesfile, 96, 0);
sysread ($placesfile, $placesver, 4) || die("Unable to read places.sqlite for version retrieval\n");
# Treat the 4 bytes as a Big Endian Integer
my $placesversion = unpack("N", $placesver);
print("\nplaces.sqlite SQLite Version is: $placesversion\n");
close($placesfile);

# Extract/Print the SQLite version number for downloads.sqlite as well
my $dloadsever=0;
open(my $dloadsfile, "<".$path."downloads.sqlite") || die("Unable to open downloads.sqlite for version retrieval\n");
#binmode($dloadsfile);
seek ($dloadsfile, 96, 0);
sysread ($dloadsfile, $dloadsever, 4) || die("Unable to read downloads.sqlite for version retrieval\n");
# Treat the 4 bytes as a Big Endian Integer
my $dloadsversion = unpack("N", $dloadsever);
print("downloads.sqlite SQLite Version is: $dloadsversion\n");
close($dloadsfile);

# Open the places.sqlite database file first
if ($bk or $hist)
{
    my $db = DBI->connect("dbi:SQLite:dbname=$path"."places.sqlite","","") || die( "Unable to connect to database\n" );
   
    # Checks if this is a valid Firefox places.sqlite
    $db->prepare("SELECT id FROM moz_places LIMIT 1") || die("The database is not a correct Firefox database".$db->errstr);

    if ($bk)
    {
        print "\nNow Retrieving Bookmarks ...\n";

        my $sth =  $db->prepare("SELECT datetime(moz_bookmarks.dateAdded/1000000, 'unixepoch') AS \'Date Added\', moz_bookmarks.title AS Title, moz_places.url AS URL, moz_places.visit_count AS Count FROM moz_bookmarks, moz_places WHERE moz_places.id = moz_bookmarks.fk ORDER BY moz_bookmarks.dateAdded ASC");
   
        $sth->execute();

        print $sth->{NUM_OF_FIELDS}." fields will be returned\n";
        PrintHeadings($sth);
        PrintResults($sth);

        # We print out the no. rows now because apparently $sth->rows isn't set until AFTER
        #  $sth->fetchrow_array() has completed in PrintResults
        if ($sth->rows == 0)
        {
            print "No Bookmarks found!\n\n";
        }
        else
        {   
            print $sth->rows." Rows returned\n";
        }
        $sth->finish;
    }

    if ($hist)
    {
        print "\nNow Retrieving History ...\n";

        my $sth =  $db->prepare("SELECT datetime(moz_historyvisits.visit_date/1000000, 'unixepoch') AS \'Date Visited\', moz_places.title AS Title, moz_places.url AS URL, moz_places.visit_count AS Count FROM moz_historyvisits, moz_places WHERE moz_historyvisits.place_id = moz_places.id ORDER BY moz_historyvisits.visit_date ASC");
   
        $sth->execute();

        print $sth->{NUM_OF_FIELDS}." fields will be returned\n";
        PrintHeadings($sth);
        PrintResults($sth);

        if ($sth->rows == 0)
        {
            print "No History found!\n\n";
        }
        else
        {   
            print $sth->rows." Rows returned\n";
        }

        $sth->finish;
    }

    $db->disconnect;
}

if ($dload)
{
    # Now we open the downloads.sqlite database file
    print "\nNow Retrieving Downloads ...\n";

    my $db = DBI->connect("dbi:SQLite:dbname=$path"."downloads.sqlite","","") || die( "Unable to connect to database\n" );

    # No further checks, we go straight into our query because it IS possible to have an empty moz_downloads table.
    my $sth =  $db->prepare("SELECT datetime(endTime/1000000, 'unixepoch') AS \'Download Ended\', source AS Source, target AS Target,  currBytes as \'Current No. Bytes\' FROM moz_downloads ORDER BY moz_downloads.endTime ASC");
   
    $sth->execute();

    print $sth->{NUM_OF_FIELDS}." fields will be returned\n";
    PrintHeadings($sth);
    PrintResults($sth);

    if ($sth->rows == 0)
    {
        print "No Downloads found!\n\n";
    }
    else
    {   
        print $sth->rows." Rows returned\n";
    }

    $sth->finish;

    $db->disconnect;
}

# end main

sub PrintHeadings
{
    my $sth = shift;

    # Print field headings
    for (my $i = 0; $i <= $sth->{NUM_OF_FIELDS}-1; $i++)
    {
        if ($i == $sth->{NUM_OF_FIELDS} - 1)
        {
            print $sth->{NAME}->[$i]."\n"; #last item adds a newline char
        }
        else
        {   
            print $sth->{NAME}->[$i]." | ";
        }
    }
}

sub PrintResults
{
    my $sth = shift;
    my @rowarray;

    # Prints row by row / field by field
    while (@rowarray = $sth->fetchrow_array() )
    {
        for (my $i = 0; $i <= $sth->{NUM_OF_FIELDS}-1; $i++)
        {
            if ($i == $sth->{NUM_OF_FIELDS} - 1 )
            {
                print $rowarray[$i]."\n"; #last field in row adds newline
            }
            else
            {
                if ($rowarray[$i])
                {
                    print $rowarray[$i]." | ";
                }
                else
                {
                    print " | "; # field returned could be UNDEFINED, just print separator
                }
            }
        }
    }
}

#END CODE


Coding Summary

So after starting off with the usual GetOptions command line parsing and the Help message sections, we try to extract the SQLite version number from "places.sqlite" file in the user nominated path. After I had trouble using sqlite3 (V3.6.16) to read Firefox 11 ".sqlite" files, I decided to print out the SQLite version in this script. It can't hurt to know the version of the SQLite database file you're investigating eh?
According to the file format described here, this version number will be at bytes 96 to 100 inclusive. It will be in the form (X*1000000 + Y*1000 + Z) as described here, eg 3007004 represents 3.7.4.
So we "open" the file for reading only (using the "<" argument), we "seek" to byte 96, then call "sysread" to store the 4 bytes in local variable called "$placesver". Our extraction isn't quite complete because Perl isn't sure how to interpret "$placesver". So we need to call "unpack" with the parameter "N" to tell Perl to interpret this variable as a 32 bit (or 4 byte) Big Endian Integer. See the section marked "How to Eat an Egg on a Net" here for further details. Once we have unpacked our number into a separate local variable ("$placesversion"), we print it to the command line and close the "places.sqlite" file.

We repeat this process for the "downloads.sqlite" file.
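
Since the same read-seek-unpack dance runs for both files, it could arguably be factored into a small helper sub. A rough sketch (the sub name "GetSQLiteVersion" is my own invention):

sub GetSQLiteVersion
{
    my $file = shift;
    my $raw;
    open(my $fh, "<".$file) || die("Unable to open $file for version retrieval\n");
    seek($fh, 96, 0); # SQLite Version number lives at bytes 96-100
    sysread($fh, $raw, 4) || die("Unable to read $file for version retrieval\n");
    close($fh);
    return unpack("N", $raw); # 4 byte Big Endian int eg 3007007
}

print "places.sqlite SQLite Version is: ".GetSQLiteVersion($path."places.sqlite")."\n";
print "downloads.sqlite SQLite Version is: ".GetSQLiteVersion($path."downloads.sqlite")."\n";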

The next 3 sections are based on the following theme:
- Call "DBI->connect" to open/connect to the relevant ".sqlite" file
- Call "prepare" to prepare an SQL SELECT statement
- Call "execute" for that statement
- Print out the number of fields returned by the statement
- Print out the Title Headers for each field/column (via our "PrintHeaders" function)
- Print out each row of results returned by the statement (via our "PrintResults" function)
- Call "finish" to free up any statement allocated resources
- Call "disconnect" to hang up the connection with the ".sqlite" file

The heart of the whole program lies in its 3 SQL queries. The first 2 are run against the "places.sqlite" file.

This is the Bookmark retrieval query:
"SELECT datetime(moz_bookmarks.dateAdded/1000000, 'unixepoch') AS \'Date Added\', moz_bookmarks.title AS Title, moz_places.url AS URL, moz_places.visit_count AS Count FROM moz_bookmarks, moz_places WHERE moz_places.id = moz_bookmarks.fk ORDER BY moz_bookmarks.dateAdded ASC"

It's basically saying: "Match up the table entries where the moz_places.id equals the moz_bookmarks.fk field and return a human readable version of the moz_bookmarks.dateAdded field, the moz_bookmarks.title field, the moz_places.url and moz_places.visit_count".
The use of the keyword AS indicates that we wish to return result names under an alias. For example, instead of returning the name "moz_places.url" and its value, the query returns "URL" and the value.
"Digital Forensics With Open Source Tools" (Altheide and Carvey) p. 150 has more details on how we determine the human readable time format.

This is the History retrieval query:
"SELECT datetime(moz_historyvisits.visit_date/1000000, 'unixepoch') AS \'Date Visited\', moz_places.title AS Title, moz_places.url AS URL, moz_places.visit_count AS Count FROM moz_historyvisits, moz_places WHERE moz_historyvisits.place_id = moz_places.id ORDER BY moz_historyvisits.visit_date ASC"

This statement is *very* similar to the one explained in "Digital Forensics With Open Source Tools" (Altheide and Carvey) p.150 except it also returns the "moz_places.visit_count" field as well.

Finally, the Downloads query is run against the "downloads.sqlite" file:
"SELECT datetime(endTime/1000000, 'unixepoch') AS \'Download Ended\', source AS Source, target AS Target,  currBytes as \'Current No. Bytes\' FROM moz_downloads ORDER BY moz_downloads.endTime ASC"

All 3 of these statements are run depending on the user's command line flags (eg -bk -hist -dload).

Testing

OK, let's see how we go reading both Firefox 3.5.17 and 11 ".sqlite" files, shall we?

I've been running SIFT's Firefox 3.5.17 in non-"Private Browsing" mode for a while now. So I copied the ".sqlite" files into "/home/sansforensics/ffexamples/" and then ran our script against the copied examples.

First we try analysing the Firefox 3.5.17 History:

sansforensics@SIFT-Workstation:~$ ffparser.pl -path ffexamples/ -hist
Running ffparser.pl v2012-03-19

places.sqlite SQLite Version is: 3007004
downloads.sqlite SQLite Version is: 3007004

Now Retrieving History ...
4 fields will be returned
Date Visited | Title | URL | Count
2011-11-16 20:37:31 | SANS Computer Forensics Training, Incident Response | http://computer-forensics.sans.org/ | 27
2011-11-16 20:57:56 | SANS Computer Forensics Training, Incident Response | http://computer-forensics.sans.org/ | 27
2011-11-16 21:02:08 | SANS Computer Forensics Training, Incident Response | http://computer-forensics.sans.org/ | 27
2011-12-05 20:06:26 | SANS Computer Forensics Training, Incident Response | http://computer-forensics.sans.org/ | 27
2011-12-05 20:06:37 | blog | http://computer-forensics.sans.org/blog | 4
2011-12-11 05:56:16 | SANS Computer Forensics Training, Incident Response | http://computer-forensics.sans.org/ | 27
... (Edited out)
2012-03-17 06:13:16 | Google | http://www.google.com.au/ | 75
2012-03-17 06:13:22 | SQLite::DB - search.cpan.org | http://search.cpan.org/~vxx/SQLite-DB-0.04/lib/SQLite/DB.pm | 2
2012-03-17 06:18:19 | ff3histview | http://blog.kiddaland.net/dw/ff3histview | 3
1472 Rows returned
sansforensics@SIFT-Workstation:~$


Phew! Nothing too embarrassing there ;) 1472 rows though - I gotta clear that history more often!
Hint: You can also run "ffparser.pl -path ffexamples/ -hist | more" so you can actually see the results before they scroll off the screen.
Also note, the SQLite version info for Firefox 3.5.17 is 3.7.4.

Next, we try looking at the Firefox 3.5.17 Bookmarks:

sansforensics@SIFT-Workstation:~$ ffparser.pl -path ffexamples/ -bk
Running ffparser.pl v2012-03-19

places.sqlite SQLite Version is: 3007004
downloads.sqlite SQLite Version is: 3007004

Now Retrieving Bookmarks ...
4 fields will be returned
Date Added | Title | URL | Count
2007-06-06 11:38:27 | Ubuntu | http://www.ubuntulinux.org/ | 0
2007-06-06 11:38:27 | Ubuntu Wiki (community-edited website) | http://www.ubuntulinux.org/wiki/FrontPage | 1
2007-06-06 11:38:27 | Make a Support Request to the Ubuntu Community | https://launchpad.net/distros/ubuntu/+addticket
... (Edited out)
2012-02-02 01:48:44 | Perl - Wikipedia, the free encyclopedia | http://en.wikipedia.org/wiki/Perl | 15
2012-02-14 06:26:14 | ExifTool by Phil Harvey | http://www.sno.phy.queensu.ca/~phil/exiftool/ | 4
2012-02-14 11:10:51 | The Perl Programming Language - www.perl.org | http://www.perl.org/ | 18
2012-02-15 12:59:30 | perlintro - perldoc.perl.org | http://perldoc.perl.org/perlintro.html | 11
2012-02-15 23:23:41 | perlstyle - perldoc.perl.org | http://perldoc.perl.org/perlstyle.html | 2
2012-02-29 05:06:19 | The CPAN Search Site - search.cpan.org | http://search.cpan.org/ | 18
2012-03-04 09:05:41 | Ubuntu Forums | http://ubuntuforums.org/ | 1
2012-03-16 11:37:26 | PerlMonks - The Monastery Gates | http://perlmonks.org/? | 1
2012-03-16 11:40:23 | Stack Overflow | http://stackoverflow.com/ | 1
33 Rows returned
sansforensics@SIFT-Workstation:~$


OK, now let's check SIFT's Firefox 3.5.17 Downloads history.

sansforensics@SIFT-Workstation:~$ ffparser.pl -path ffexamples/ -dload
Running ffparser.pl v2012-03-19

places.sqlite SQLite Version is: 3007004
downloads.sqlite SQLite Version is: 3007004

Now Retrieving Downloads ...
4 fields will be returned
Download Ended | Source | Target | Current No. Bytes
2012-02-14 06:15:02 | http://blob.perl.org/books/impatient-perl/iperl.pdf | file:///root/Desktop/iperl.pdf | 480850
2012-02-14 06:39:42 | http://owl.phy.queensu.ca/~phil/exiftool/Vodafone.tar.gz | file:///root/Desktop/Vodafone.tar.gz | 61366
2012-02-14 06:40:43 | http://owl.phy.queensu.ca/~phil/exiftool/Vodafone.tar.gz | file:///root/Documents/Vodafone.tar.gz | 61366
... (Edited out)
2012-03-16 12:06:29 | http://www.forensicswiki.org/w/images/d/d0/Firefox3_places_relationship_schema.png | file:///cases/Firefox3_places_relationship_schema.png | 24481
21 Rows returned
sansforensics@SIFT-Workstation:~$


Everything looks cool so far, eh? As a check, I used the SQLite Manager plugin to view these same files and confirmed that randomly selected entries matched the script's output.

Now let's try using our script with Firefox 11 files in "Private Browsing" mode. For this part, I browsed using WinXP Firefox 11 and downloaded a copy of ProcMon. I then copied the ".sqlite" files over to SIFT's "/cases/firefox11-prv/" directory.

First, let's see if we can find any Firefox 11 "Private Browsing" History info:

sansforensics@SIFT-Workstation:~$ ffparser.pl -path /cases/firefox11-prv/ -hist
Running ffparser.pl v2012-03-19

places.sqlite SQLite Version is: 3007007
downloads.sqlite SQLite Version is: 3007007

Now Retrieving History ...
4 fields will be returned
Date Visited | Title | URL | Count
No History found!

sansforensics@SIFT-Workstation:~$


No real surprise there - the history appears to be empty.
Also note, the SQLite version info for Firefox 11 is 3.7.7 (ie a later version than Firefox 3.5.17's).

Let's look at the Firefox 11 "Private Browsing" Bookmarks info:

sansforensics@SIFT-Workstation:~$ ffparser.pl -path /cases/firefox11-prv/ -bk | more
Running ffparser.pl v2012-03-19

places.sqlite SQLite Version is: 3007007
downloads.sqlite SQLite Version is: 3007007

Now Retrieving Bookmarks ...
4 fields will be returned
Date Added | Title | URL | Count
2009-03-08 16:26:34 | NFL.com - Official Site of the National Football League | http://www.nfl.com/ | 0
2009-03-08 16:29:28 | The National Football Post | http://www.nationalfootballpost.com/ | 0
2009-03-08 16:32:05 | Breaking news, real-time scores and daily analysis from Sports Illustrated – SI.com | http://sportsillustrated.cnn.com/ | 0
... (edited)
2012-02-13 18:09:28 | ExifTool by Phil Harvey | http://www.sno.phy.queensu.ca/~phil/exiftool/index.html | 0
2012-02-20 09:29:53 | The Perl Programming Language - www.perl.org | http://www.perl.org/ | 0
2012-02-20 09:30:10 | The CPAN Search Site - search.cpan.org | http://search.cpan.org/ | 0
473 Rows returned
sansforensics@SIFT-Workstation:~$


As expected, the Bookmark information is retained regardless of browser mode.
Now, let's see what's in the Firefox 11 "Private Browsing" Downloads info:

sansforensics@SIFT-Workstation:~$ ffparser.pl -path /cases/firefox11-prv/ -dload
Running ffparser.pl v2012-03-19

places.sqlite SQLite Version is: 3007007
downloads.sqlite SQLite Version is: 3007007

Now Retrieving Downloads ...
4 fields will be returned
Download Ended | Source | Target | Current No. Bytes
No Downloads found!

sansforensics@SIFT-Workstation:~$


No real surprises there either - the Downloads history also appears empty in "Private Browsing" mode.

And finally(!!!), here are the results of our script with Firefox 11 in non-"Private Browsing" mode. As before, I browsed using WinXP Firefox 11 (this time in normal mode) and downloaded a copy of ProcMon. I then copied the ".sqlite" files over to SIFT's "/cases/firefox11/" directory.

Here's the Firefox 11 in non-"Private Browsing" History info:

sansforensics@SIFT-Workstation:~$ ffparser.pl -path /cases/firefox11/ -hist
Running ffparser.pl v2012-03-19

places.sqlite SQLite Version is: 3007007
downloads.sqlite SQLite Version is: 3007007

Now Retrieving History ...
4 fields will be returned
Date Visited | Title | URL | Count
2012-03-19 11:54:15 | Google | http://www.google.com.au/ | 1
2012-03-19 11:55:05 | Windows Sysinternals: Documentation, downloads and additional resources | http://technet.microsoft.com/en-us/sysinternals/default.aspx | 1
2012-03-19 11:56:00 | Sysinternals Process Utilities | http://technet.microsoft.com/en-us/sysinternals/bb795533 | 1
2012-03-19 11:56:33 | Process Monitor | http://technet.microsoft.com/en-us/sysinternals/bb896645 | 1
2012-03-19 11:56:39 | ProcessMonitor.zip | http://download.sysinternals.com/files/ProcessMonitor.zip | 0
5 Rows returned
sansforensics@SIFT-Workstation:~$


And the Firefox 11 in non-"Private Browsing" Bookmark info:

sansforensics@SIFT-Workstation:~$ ffparser.pl -path /cases/firefox11/ -bk | more
Running ffparser.pl v2012-03-19

places.sqlite SQLite Version is: 3007007
downloads.sqlite SQLite Version is: 3007007

Now Retrieving Bookmarks ...
4 fields will be returned
Date Added | Title | URL | Count
2009-03-08 16:26:34 | NFL.com - Official Site of the National Football League | http://www.nfl.com/ | 0
2009-03-08 16:29:28 | The National Football Post | http://www.nationalfootballpost.com/ | 0
2009-03-08 16:32:05 | Breaking news, real-time scores and daily analysis from Sports Illustrated – SI.com | http://sportsillustrated.cnn.com/ | 0
... (edited)
2012-02-13 18:09:28 | ExifTool by Phil Harvey | http://www.sno.phy.queensu.ca/~phil/exiftool/index.html | 0
2012-02-20 09:29:53 | The Perl Programming Language - www.perl.org | http://www.perl.org/ | 0
2012-02-20 09:30:10 | The CPAN Search Site - search.cpan.org | http://search.cpan.org/ | 0
473 Rows returned
sansforensics@SIFT-Workstation:~$


Unsurprisingly, these are the same results as for the "Private Browsing" Bookmarks section detailed above.

And finally, the Firefox 11 in non-"Private Browsing" Downloads output:

sansforensics@SIFT-Workstation:~$ ffparser.pl -path /cases/firefox11/ -dload
Running ffparser.pl v2012-03-19

places.sqlite SQLite Version is: 3007007
downloads.sqlite SQLite Version is: 3007007

Now Retrieving Downloads ...
4 fields will be returned
Download Ended | Source | Target | Current No. Bytes
2012-03-19 11:56:57 | http://download.sysinternals.com/files/ProcessMonitor.zip | file:///C:/ProcessMonitor.zip | 1324628
1 Rows returned
sansforensics@SIFT-Workstation:~$


Pretty cool - we can see where we visited the Sysinternals website and downloaded "ProcessMonitor.zip".

Some Interesting(?) Side Notes:

If you type:
about:cache
into the Firefox address bar (in both 3.5.17 and 11), Firefox displays the current cache contents via a clickable HTML-like interface. Note: "Private Browsing" mode automatically clears the cache when Firefox is closed.

With Firefox 11 in "Private Browsing"/"Forget History" mode, CCleaner v3.14.1616 will detect and remove "downloads.sqlite", "content-prefs.sqlite", "formhistory.sqlite".
With Firefox 11 in non-"Private Browsing"/"Remember History" mode, CCleaner v3.14.1616 will detect and remove the cookies (but not the "cookies.sqlite" file), the user profile cache and the following sqlite files - "downloads.sqlite", "content-prefs.sqlite", "formhistory.sqlite", "signons.sqlite". Depending on what sites are visited, some of the ".sqlite" files may not be created (eg signons.sqlite).

If you (masochistically) want to view the Firefox source code, it's available here.

And we're done!
If you're still reading, it's time to grab a beverage ;)
Or if you skipped ahead to get the summary (you missed out on the fun!) - we have successfully used Perl's DBI package to read Firefox 3.5.17 and 11 ".sqlite" files for history, bookmark and download information. Reading other software's ".sqlite" files should follow a similar process.
Thanks must go to Kristinn Gudjonsson for providing his original "ff3histview" script upon which this particular house of cards is based.
Hopefully, you've gotten something out of this uber-long post - there is a lot of information out there and when in doubt, I added it in here.

Comments/Suggestions are welcome! I think I'll leave any SQLite carving attempts for now though - it seems a little daunting for this monkey.

Saturday 10 March 2012

Quick Tutorial On Re-using My Perl Scripts

Hi All,

What a busy week for this little monkey!
A fellow monkey recently contacted me about some problems they had getting my "exif2map.pl" script to work on SIFT. Specifically, they were getting "command not found" errors whenever they launched their version.

I thought I'd write a quick guide just in case anyone else was having issues. This procedure applies to all of the Perl scripts I have posted on this blog.

I'll use my latest "exif2map.pl" as an example:

# START CODE


#!/usr/bin/perl -w

# Perl script to take the output of exiftool and conjure up a web link
# to google maps if the image has stored GPS lat/long info.

use strict;

use Image::ExifTool;
use Image::ExifTool::Location;
use Getopt::Long;
use HTML::QuickTable;
use File::Find;

# commented out for now - apparently File::Find can issue some weird warnings
#no warnings 'File::Find';

my $version = "exif2map.pl v2012.02.21";
my $help = ''; # help flag
my $htmloutput = ''; #html flag
my @filenames; # input files from -f flag
my @directories; # input directories from -dir flag (must use absolute paths)

my %file_listing; # stored results

GetOptions('help|h' => \$help,
    'html' => \$htmloutput,
    'f=s@' => \@filenames,
    'dir=s@' => \@directories);

if ($help||(@filenames == 0 && @directories == 0))
{
    print("\n$version\n");
    print("Perl script to take the output of exiftool and conjure up a web link\n");
    print("to google maps if the image has stored GPS lat/long info.\n");

    print("\nUsage: exif2map.pl [-h|help] [-f filename] [-html]\n");
    print("-h|help .......... Help (print this information). Does not run anything else.\n");
    print("-f filename ...... File(s) to extract lat/long from\n");
    print("-dir directory ... Absolute path to folder containing file(s) to extract lat/long from\n");
    print("-html ............ Also output results as a timestamped html file in current directory\n");

    print("\nExample: exif2map.pl -f /cases/galloping-gonzo.jpg");
    print("\nExample: exif2map.pl -f /cases/krazy-kermit.jpg -dir /cases/rockin-rowlf-pics/ -html\n\n");
    print("Note: Outputs results to command line and (if specified) to a timestamped html file\n");
    print("in the current directory (e.g. exif2map-output-TIMESTAMP.html)\n\n");
   
    exit;
}

# Main processing loop
print("\n$version\n");

# Process filenames specified using the -f flag first
if (@filenames)
{
    foreach my $name (@filenames)
    {
        ProcessFilename($name);
    }
}

# Process folders specified using the -dir flag
# Note: Will NOT follow symbolic links to files
if (@directories)
{
    find(\&ProcessDir, @directories);
}

# If html output required AND we have actually retrieved some data ...
if ( ($htmloutput) && (keys(%file_listing) > 0) )
{   
    #timestamped output filename
    my $htmloutputfile = "exif2map-output-".time.".html";

    open(my $html_output_file, ">".$htmloutputfile) || die("Unable to open $htmloutputfile for writing\n");

    my $htmltable = HTML::QuickTable->new(border => 1, labels => 1);

    # Added a preceding "/" to "Filename" so that the HTML::QuickTable sorting doesn't result in
    # the column headings being re-ordered after / below a filename beginning with a "\".
    $file_listing{"/Filename"} = "GoogleMaps Link";

    print $html_output_file "<HTML>";
    print $html_output_file $htmltable->render(\%file_listing);
    print $html_output_file "<\/HTML>";

    close($html_output_file); # close the html output filehandle
    print("\nPlease refer to \"$htmloutputfile\" for a clickable link output table\n\n");
}

sub ProcessFilename
{
    my $filename = shift;

    if (-e $filename) #file must exist
    {
        my $exif = Image::ExifTool->new();
        # Extract all info from existing image
        if ($exif->ExtractInfo($filename))
        {
            # Ensure all 4 GPS params are present
            # ie GPSLatitude, GPSLatitudeRef, GPSLongitude, GPSLongitudeRef
            # The Ref values indicate North/South and East/West
            if ($exif->HasLocation())
            {
                my ($lat, $lon) = $exif->GetLocation();
                print("\n$filename contains Lat: $lat, Long: $lon\n");
                print("URL: http://maps.google.com/maps?q=$lat,+$lon($filename)&iwloc=A&hl=en\n");
                if ($htmloutput) # save GoogleMaps URL to global hashmap indexed by filename
                {
                    $file_listing{$filename} = "<A HREF = \"http://maps.google.com/maps?q=$lat,+$lon($filename)&iwloc=A&hl=en\"> http://maps.google.com/maps?q=$lat,+$lon($filename)&iwloc=A&hl=en</A>";
                }
                return 1;
            }
            else
            {
                print("\n$filename : No Location Info available!\n");
                return 0;
            }
        }
        else
        {
            print("\n$filename : Cannot Extract Info!\n");
            return 0;
        }
    }
    else
    {
        print("\n$filename does not exist!\n");
        return 0;
    }
}

sub ProcessDir
{
    # $File::Find::dir is the current directory name,
    # $_ is the current filename within that directory
    # $File::Find::name is the complete pathname to the file.
    my $filename = $File::Find::name; # should contain absolute path eg /cases/pics/krazy-kermit.jpg

    if (-f $filename) # must be a file not a directory name ...
    {
        my $exif = Image::ExifTool->new();
        # Extract all info from existing image
        if ($exif->ExtractInfo($filename))
        {
            # Ensure all 4 GPS params are present
            # ie GPSLatitude, GPSLatitudeRef, GPSLongitude, GPSLongitudeRef
            # The Ref values indicate North/South and East/West
            if ($exif->HasLocation())
            {
                my ($lat, $lon) = $exif->GetLocation();
                print("\n$filename contains Lat: $lat, Long: $lon\n");
                print("URL: http://maps.google.com/maps?q=$lat,+$lon($filename)&iwloc=A&hl=en\n");
                if ($htmloutput) # save GoogleMaps URL to global hashmap indexed by filename
                {
                    $file_listing{$filename} = "<A HREF = \"http://maps.google.com/maps?q=$lat,+$lon($filename)&iwloc=A&hl=en\"> http://maps.google.com/maps?q=$lat,+$lon($filename)&iwloc=A&hl=en</A>";
                }
                return 1;
            }
            else
            {
                print("\n$filename : No Location Info available!\n");
                return 0;
            }
        }
        else
        {
            print("\n$filename : Cannot Extract Info!\n");
            return 0;
        }
    }
}

# END CODE


1. OK, what I should have mentioned was that the very first line of your "exif2map.pl" script must be the line containing "#!/usr/bin/perl -w".
This "shebang" line tells Unix, "Hey, make sure you use the Perl interpreter on this script". It has GOT to be the first thing Unix sees in the script. Otherwise, you will probably get a bunch of "command not found" errors. The "# START CODE" and "# END CODE" comments aren't supposed to be in the script file.

2. Copy/paste everything from that first line down to just BEFORE the "# END CODE".

3. According to this article, it might be best not to save the script using a Windows text editor if you intend to eventually run it on SIFT - you're better off using Firefox / gedit on SIFT. The reason being, a Windows text editor may save the file using an incorrect character set or add extra control characters (eg Windows-style CRLF line endings, which can break the "#!/usr/bin/perl" interpreter line). Probably unlikely to happen, but documented here just in case.
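
If you suspect a script has picked up Windows-style line endings, one quick fix (assuming Perl is already installed, which it is on SIFT) is to strip the carriage returns in place with a one-liner before running it:

perl -pi -e 's/\r$//' /usr/local/bin/exif2map.pl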

4. After saving the script (preferably in "/usr/local/bin/"), don't forget to make it executable using "chmod a+x /usr/local/bin/exif2map.pl".
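
Assuming "/usr/local/bin" is in your PATH (it should be by default on SIFT), you can then sanity-check the install by printing the script's help text:

sansforensics@SIFT-Workstation:~$ chmod a+x /usr/local/bin/exif2map.pl
sansforensics@SIFT-Workstation:~$ exif2map.pl -h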

5. For some reason, when I used CPAN to install Image::ExifTool::Location, the module seemed to download/install OK but then I got some testing failures. I am not sure why this happened - maybe the version of CPAN on SIFT is not compatible with Image::ExifTool::Location's tests? Maybe the module is expecting a particular environment path? I don't know. The weird thing was that none of the other CPAN modules I subsequently installed had any testing issues *shrug*.
Anyway, like the true hack I am, I decided to code with it anyway and see if it worked afterwards :) Subsequently, I couldn't see any problems after testing the "exif2map.pl" script against the ExifTool Firefox plugin. Popular Programmers' Paradigm - if it works, don't touch it!
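
For what it's worth, if you hit similar test failures when installing a module, the CPAN shell does let you (at your own risk) force the install anyway - something like:

sansforensics@SIFT-Workstation:~$ sudo cpan
cpan> force install Image::ExifTool::Location

I can't promise that's the right call in every case though - a failing test can obviously indicate a genuinely broken module, so test your script's output afterwards like I did above.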

So that's the end of this "quick" little post. Hope this helps!