Sunday, May 2, 2010

Google Chrome Forensic


This post is actually posted in SANS computer forensic lab by Kristinn under Browser Forensics, Computer Forensics. This is pretty useful information about Google Chrome so i am linking it in here..
Google Chrome stores the browser history in a SQLite database, not unlike Firefox.  Yet the structure of the database file is quite different.
[ad#Google Adsense Text only 728-90]
Chrome stores its files in the following locations:
  • Linux: /home/$USER/.config/google-chrome/
  • Linux: /home/$USER/.config/chromium/
  • Windows Vista (and Win 7): C:Users[USERNAME]AppDataLocalGoogleChrome
  • Windows XP: C:Documents and Settings[USERNAME]Local SettingsApplication DataGoogleChrome
There are two different versions of Google Chrome for Linux, the official packets distributed by Google, which stores its data in the google-chrome directory and the Linux distributions version Chromium.
The database file that contains the browsing history is stored under the Default folder as “History” and can be examined using any SQLlite browser there is (such as sqlite3).  The available tables are:
  • downloads
  • presentation
  • urls
  • keyword_search_terms
  • segment_usage
  • visits
  • meta
  • segments
The most relevant tables for browsing history are the “urls” table that contains all the visited URLs, the “visits” table that contains among other information about the type of visit and the timestamps and finally the “downloads” table that contains a list of downloaded files.
If we examine the urls table for instance by using sqlite3 we can see:
sqlite> .schema urls
CREATE TABLE urls(id INTEGER PRIMARY KEY,url LONGVARCHAR,title LONGVARCHAR,visit_count INTEGER DEFAULT 0 NOT NULL,
typed_count INTEGER DEFAULT 0 NOT NULL,last_visit_time INTEGER NOT NULL,hidden INTEGER DEFAULT 0 NOT NULL,
favicon_id INTEGER DEFAULT 0 NOT NULL);
CREATE INDEX urls_favicon_id_INDEX ON urls (favicon_id);
CREATE INDEX urls_url_index ON urls (url);
And the visits table
sqlite> .schema visits
CREATE TABLE visits(id INTEGER PRIMARY KEY,url INTEGER NOT NULL,visit_time INTEGER NOT NULL,from_visit INTEGER,transition INTEGER DEFAULT 0 NOT NULL,segment_id INTEGER,is_indexed BOOLEAN);
CREATE INDEX visits_from_index ON visits (from_visit);
CREATE INDEX visits_time_index ON visits (visit_time);
CREATE INDEX visits_url_index ON visits (url);
So we can construct a SQL statement to get some information about user browser habit.
SELECT urls.url, urls.title, urls.visit_count, urls.typed_count, urls.last_visit_time, urls.hidden, visits.visit_time, visits.from_visit, visits.transition
FROM urls, visits
WHERE
 urls.id = visits.url
This SQL statement extracts all the URLs the user visited alongside the visit count, type and timestamps.
If we examine the timestamp information from the visits table we can see they are not constructed in an Epoch format.  The timestamp in the visit table is formatted as the number of microseconds since midnight UTC of 1 January 1601, which other have noticed as well, such as firefoxforensics.
If we take a look at the schema of the downloads table (.schema downloads) we see
CREATE TABLE downloads (id INTEGER PRIMARY KEY,full_path LONGVARCHAR NOT NULL,url LONGVARCHAR NOT NULL,
start_time INTEGER NOT NULL,received_bytes INTEGER NOT NULL,total_bytes INTEGER NOT NULL,state
And examine the timestamp there (the start_time) we can see that it is stored in Epoch format.
There is one more interesting thing to mention in the “visits” table.  It is the row “transition”.  This value describes how the URL was loaded in the browser.  For full documentation see the source code of page_transition_types or in a shorter version the core parameters are the following:
  • LINK. User go to the page by clicking a link.
  • TYPED. User typed the URL in the URL bar.
  • AUTO_BOOKMARK. User got to this page through a suggestion in the UI, for example,through the destinations page
  • AUTO_SUBFRAME. Any content that is automatically loaded in a non-toplevel frame. User might not realize this is a separate frame so he might not know he browsed there.
  • MANUAL_SUBFRAME. For subframe navigations that are explicitly requested by the user and generate new navigation entries in the back/forward list.
  • GENERATED. User got to this page by typing in the URL bar and selecting an entry that did not look like a URL.
  • START_PAGE. The user’s start page or home page or URL passed along the command line (Chrome started with this URL from the command line)
  • FORM_SUBMIT. The user filled out values in a form and submitted it.
  • RELOAD.  The user reloaded the page, whether by hitting reload, enter in the URL bar or by restoring a session.
  • KEYWORD. The url was generated from a replaceable keyword other than the default search provider
  • KEYWORD_GENERATED. Corresponds to a visit generated for a keyword.
The transition variable contains more information than just the core parameters.  It also stores so called qualifiers such as whether or not this was a client or server redirect and if this a beginning or an end of a navigation chain.
When reading the transition from the database and extracting just the core parameter the variable CORE_MASK has to be used to AND with the value found inside the database.
CORE_MASK = 0xFF,
I’ve created an input module to log2timeline to make things a little bit easier by automating this.  At this time the input module is only available in the nightly builds, but it will be released in version 0.41 of the framework.
An example usage of the script is the following:
log2timeline -f chrome -z local History
0|[Chrome] User: kristinng URL visited: http://tools.google.com/chrome/intl/en/welcome.html (Get started with Google Chrome) [count: 1] Host: tools.google.com type: [START_PAGE - The start page of the browser] (URL not typed directly)|0|0|0|0|0|1261044829|1261044829|1261044829|1261044829
0|[Chrome] User: kristinng URL visited: http://isc.sans.org/ (SANS Internet Storm Center; Cooperative Network Security Community - Internet Security) [count: 1] Host: isc.sans.org type: [TYPED - User typed the URL in the URL bar] (directly typed)|0|0|0|0|0|1261044989|1261044989|1261044989|1261044989..
The script reads the user name from the directory path the history file was found and then reads the database structure from the History file and prints out the information in a human readable form (this output is in mactime format).  To convert the information found here in CSV using mactime
log2timeline -f chrome -z local History > bodyfile
mactime -b bodyfile -d > body.csv
And the same lines in the CSV file are then:
Thu Dec 17 2009 10:13:49,0,macb,0,0,0,0,[Chrome] User: kristinng URL visited: http://tools.google.com/chrome/intl/en/welcome.html (Get started with Google Chrome) [count: 1] Host: tools.google.com type: [START_PAGE - The start page of the browser] (URL not typed directly)
Thu Dec 17 2009 10:16:29,0,macb,0,0,0,0,[Chrome] User: kristinng URL visited: http://isc.sans.org/ (SANS Internet Storm Center; Cooperative Network Security Community - Internet Security) [count: 1] Host: isc.sans.org type: [TYPED - User typed the URL in the URL bar] (directly typed)
Original information is published in this link