patternoflife

Guest Post by @bizzybarney! A Peek Inside the PPSQLDatabase.db Personalization Portrait Database

The DFIR Twitter-sphere exploded this morning when @mattiaep mentioned /private/var/mobile/Library/PersonalizationPortrait/PPSQLDatabase.db. I’ve been doing some research work on this file and plan to present pieces of it during my talk at the upcoming SANS DFIR Summit. I reached out to @iamevltwin and asked if she would host a quick blog post and she graciously agreed - but I now owe her gin, and steak and cheese egg rolls. In all seriousness, a huge thank you to my good friend and mentor!

Check out the upcoming Summit agenda here.

The PersonalizationPortrait directory is native on iOS 13 and contains a few interesting files. One specifically, PPSQLDatabase.db, is loaded with data. From my research, the directory existed on iOS 12 but this database did not.  Some of the data is repetitive from other native locations, but there’s a lot of context that can be correlated from this database and pieces of this data might exist longer here than they do in other places. So what is the purpose of this database? Don’t forget as you poke at it, that it has ‘Personalization’ in its title. It resides in the native /Library directory, so it’s Apple. What are you up to Apple? 

My research on this is still ongoing, but one thing I know is Apple wants me to like my device and to feel comfortable while I’m constantly immersed in their operating system. When I open News I want to see certain things of interest. When I select a photo to send to a friend Apple gives me a short list of probable recipients. When I plug my car into CarPlay I appreciate when Apple weirdly predicts where I’m headed. Is this ‘Personalization’ just for Apple or are pieces of this filtering out to other parties for better advertisement tailoring? The Widgets screen to the left of my home screen on my iPhone is a dynamic representation of so many pieces of personal data. What’s next on my Calendar and where it’s occurring, which apps I recently used, News, Weather based on my location, where I last parked my car when using CarPlay, and ScreenTime totals I can’t hardly believe. 

I want to highlight a few pieces of the PPSQLDatabase.db file to show it’s potential, and pass along an appropriate amount of caution as well. This database is aggregating data from many sources and attribution must be done carefully.

The first table to peek into is loc_records. It is absolutely loaded with location data - mine had 1000 entries. The 1000 is curious, and I wonder if anyone else’s table is limited at 1000 or if mine just landed on that nice even number. @iamevltwin did a quick check and found 1000 as well on Mac and iOS, so it seems that is the limit for the table. In reviewing just the locations from this table without joining any other table data, I quickly recognize many of them as places I have definitely been with this device in my pocket. But…then there are those locations I definitely haven’t been, ever. So while having 1000 locations records to play with sounds like good fun, don’t forget to be diligent with attribution.

Most of these location records have a column named “clp_location” which is stored as BLOB data. At closer inspection the BLOB data is a binary plist (.bplist). 

Using plutil -p produces a more friendly output format to inspect, reminiscent of Cloud-V2.sqlite and other “Significant Location” files. But that doesn’t mean my device was at that place at that time, because I wasn’t in Las Vegas on June 2nd, 2020 which is what that timestamp converts to. I checked my photos based on the ‘Places’ feature, and I did take a photo at that location - but it was June 4th, 2019 at 5:35PM. Weird.

So why is a photo from a year ago popping up? Making a few joins and gathering a bit more data produces more context to this specific item. After hammering out a query to join it over to the ‘sources’ table, I am able to see that this entry is associated with ‘com.apple.mobileslideshow’ and more specifically ‘com.apple.proactive.PersonalizationPortrait.PhotosGraphDonation’. One word really jumps out there - ‘proactive.’ That makes me think Apple is doing something for me here, and without asking. I went to my Photos section and found a categorization of Las Vegas photos for June 2-7 in my ‘Months’ category, and the photo is in that album. As I scanned the other places in these ‘proactive’ listings I am also able to see photo groupings from a trip to Jamaica (Kingston) and to Huntington Beach, CA in their own categories. 

So you may be asking yourself at this point, “Why did I just read this? It’s a location in a photo.” Stick with me, and remember we are trying to figure out what this newly discovered file is doing and how far reaching it might be. 

A few lines down from the photos we have an address in New Jersey being attributed to com.apple.mobilemail. I tweaked the query to ‘localtime’ for the sake of presentation, but I received an email on May 31, 2020 at 11:17AM EDT regarding the June 2nd release of Cellebrite Physical Analyzer version 7.34. In the signature of that email was the US headquarters address for Cellebrite - 7 Campus Drive, Suite 210, Parsippany, NJ. 

This is just scratching the surface of a few items from one table with basic table joins made. Scanning other bundles represented here I can see some of my favorite places from Apple Maps, locations recently mentioned in iMessage conversations, and Calendar records of my standing 12:30PM Zoom meeting for Life has No Ctrl+Alt+Del. 

If you don’t have time to research but would like to hear more about it, tune in to my talk at the SANS DFIR Summit on July 16th! If you do, try out this query I used for this blog post to pry around at this one table from this database and let me know how it works on your data via Twitter @bizzybarney

select
 loc_records.id,
 	sources.bundle_id as "Bundle ID",
 	sources.group_id as "Group ID",
 	datetime(sources.seconds_from_1970, 'unixepoch') as "Source Time",
 	loc_records.cll_latitude_degrees || ", "|| loc_records.cll_longitude_degrees as "Coordinates",
 	loc_records.clp_name as "Name",
 	loc_records.clp_thoroughfare as "Road",
 	loc_records.clp_subThoroughfare as "Address #",
 	loc_records.clp_locality as "City",
 	loc_records.clp_subLocality as "Sub-locality",
 	loc_records.clp_administrativeArea as "Admin Area",
 	loc_records.clp_subAdministrativeArea as "Sub Admin Area",
 	loc_records.clp_postalCode as "Postal Code",
 	loc_records.clp_ISOcountryCode as "Country Code",
 	loc_records.clp_country as "Counrty",
 	hex (loc_records.clp_location) as "Location BLOB (hex)",
 	loc_records.extraction_os_build as "iOS Build Version",
 	loc_records.category as "Category",
 	loc_records.algorithm as "Algorithm",
 	loc_records.initial_score as "Initial Score",
 	CASE
 	WHEN loc_records.is_sync_eligible = 1 then "Yes"
	WHEN loc_records.is_sync_eligible = 0 then "No"
	end as "Sync Eligible"
from loc_records  
left join sources on loc_records.source_id=sources.id

New Year New APOLLO – Officially out of Beta iOS 13 Module Updates!

I spent this weekend updating and sprucing up APOLLO for its v1.0 release. It took far longer than anticipated, mostly because I’ve added quite a few new modules. It also takes a while to go through every SQL query module updating it to iOS 13. Database schemas change all the time, new tables, new columns, new everything!

I’ve done my best to test the new modules and a few of the updated older modules with iOS 13.1 and iOS 12.1.1. I have no doubts that some may work with iOS 11 as well but in order to keep on keepin’ on, I’ve decided to only test two major versions back. If you find one works for iOS 11 or older please let me know via Contact or submit a PR. If you happen to be working on an iOS 12 device I would definitely re-process for many new and updated modules! This brings the total modules to 165!

I will talk about a some of these in my upcoming Webinar for Blackbag on January 30 at 2pm EST.

Apple Health Modules

I have added imperial measurements to modules so us yanks can actually understand the measurements 😉:

New modules include:

Location Modules

Routined Databases changed a bit in iOS 13. Cloud.sqlite is now Cloud-V2.sqlite. Most modules did not need to be updated; however a new table was introduced ‘ZRTMAPITEMMO’. Modules were updated to include this data as well as a new separate Map Item module - routined_cloud_mapitem.

In the Cache.sqlite database, a new table was introduced, ‘ZRTVISITMO’ so a new module was created for it - routined_cache_zrvisitmo.

KnowledgeC Modules

The biggest update was with my favorite database knowledgeC.db.

Update modules include more context to the Application Activity modules. Some of the iOS native applications were split out into their own (they should also be parse in the generic knowledge_app_activity module.) Note: Calendar - was its own previously, however it was renamed to knowledge_app_activity_calendar to be consistent with the new modules below.

New modules include:

New Presentation from MacDevOpsYVR 2019 - Launching APOLLO: Creating a Simple Tool for Advanced Forensic Analysis

I had the pleasure last week to attend MacDevOpsYVR in Vancouver, Canada. While I barely saw the city, I got to hang out with some awesome Mac Sys Admins and Dev Ops people. I’ve not been to a conference outside of Security/Forensics before so it was a delight to see the types of presentations and insight these fine folks had to offer.

The presentation includes how my APOLLO project has evolved over the last few months since it was introduced in November, 2018. I also go though some of my real life pattern-of-life examples from my iOS 12 device. We talked about everything including to my health, moving bodies (and chopping them up!), taking selfies, and how much I will spend for good food. Once the video is released I will be sure to upload a link to it, it will certainly provide more (humourous) context to the slides. [Edit 06/18/2019 - Video here!]

A unique addition to normal conference presentations was the use of a graphic recorder (Ashton of Mind’s Eye Creative) to provide additional context to the presentations. She records in real time key points of each presentation and does an absolutely fantastic job at it. This allows for additional context for discussions after the presentation with fellow attendees. Example of my talk is below:

As always, my presentations are always available on my Resources page.

Direct Link to the presentation is here!

Apple Pattern of Life Lazy Output’er (APOLLO) Updates & 40 New Modules (Location, Chat, Calls, Apple Pay Transactions, Wallet Passes, Safari & Health Workouts)

I started filling in the gaps to missing APOLLO modules. While doing this I realized there was some capability that was missing with the current script that had to be updated. As far as script updates go the following was done:

  • Support for multiple database name -Depending on the iOS version being used the database names may be different but the SQL query itself is the same. Instead of creating many redundant modules I now have it looking for the different database filenames.

  • Support for multiple queries on different iOS versions- You will now notice that all the modules have been updated with iOS version indicators and multiple SQL queries compatible for that version. Some going back as far as iOS 8! I put in as much legacy support as I had data for. I will likely not add much more unless it is by special request. I can’t imagine there is a whole lot of iOS 8 analysis going on out there, but you never know! I have kept it to major iOS release numbers and have tested with the data I have but it may have changed with a minor point release, if you find this to be true please feel free to let me know!

  • Module Timestamp Rearrangement and Module Cleanup– I’ve started to go through some of the modules and move the items around to make it easier to see what is going on with each record. I’ve mostly just moved the timestamps toward the end since most of them are shown in the Key column. I’ve also removed some superfluous columns and extraneous junk in the queries. I’m really only trying to extract the most relevant data. 

A few notes on script usage change. The script flags have changed, with the added arguments of -p = platform (iOS support only for now), and -v = iOS Version. You may also notice the new ‘yolo’ option – this one will likely be error prone if you are not careful. Use this when you what to run it on any database from any platform. It can also be used with your own custom modules if you don’t have versioning in them.

An example of the module changes is below. Notice the multiple databases listed. In this example, the same location data can be extracted from the cache_encryptedB.db or the cache_encrypted A.db databases depending on the iOS versions. The version information is listed in the “VERSIONS” key, while the specific queries have versions listed in the [SQL Query …] brackets, this is the version that the apollo.py script is following.

The big updates were with the modules, lots of new support! I now have support for 129 different pattern-of-life items! Most of the support is for iOS, however if you run the queries themselves on similar macOS databases you will find that many of them will work. Better macOS support is coming, I promise.

Application Specific Usage:

  • Chat – SMS, iMessage, & FaceTime messages extracted from the sms.db database.

    • sms_chat

  • Call History – Extracted from CallHistory.storedata database.

    • call_history

  • Safari Browsing – Extracted from the History.db database.

    • safari_history

  • Apple Pay/Wallet - Extracted from iOS passes23.sqlite database.

    • Apple Pay Transactions - passes23_wallet_transactions

    • Wallet Passes - passes23_wallet_passes

 Location :

  • locationd - The following modules extract location data from the [lock]cache_encryptedA.db & cache_encryptedB.db databases. This will include various cellular and Wi-Fi based location tables as listed in the module filename.

    • locationd_cacheencryptedAB_appharvest

    • locationd_cacheencryptedAB_cdmacelllocation

    • locationd_cacheencryptedAB_celllocation

    • locationd_cacheencryptedAB_celllocationharvest

    • locationd_cacheencryptedAB_celllocationlocal

    • locationd_cacheencryptedAB_cmdacelllocationharvest

    • locationd_cacheencryptedAB_indoorlocationharvest

    • locationd_cacheencryptedAB_locationharvest

    • locationd_cacheencryptedAB_ltecelllocation

    • locationd_cacheencryptedAB_ltecelllocationharvest

    • locationd_cacheencryptedAB_ltecelllocationlocal

    • locationd_cacheencryptedAB_passharvest

    • locationd_cacheencryptedAB_poiharvestlocation

    • locationd_cacheencryptedAB_pressurelocationharvest

    • locationd_cacheencryptedAB_scdmacelllocation

    • locationd_cacheencryptedAB_wifilocation

    • locationd_cacheencryptedAB_wifilocationharvest

    • locationd_cacheencryptedAB_wtwlocationharvest

  • locationd – These modules extract motion data from the cache_encryptedC.db database. Not specific location data but will show device movement.

    • locationd_cacheencryptedC_motionstatehistory

    • locationd_cacheencryptedC_nataliehistory

    • locationd_cacheencryptedC_stepcounthistory

  • routined – Extracts location data from the cache_encryptedB.db database. If you have a keen eye you will notice the database name is the same as from ‘locationd’. Completely different database with different data stored in two different directories.

    • routined_cacheencryptedB_hint

    • routined_cacheencryptedB_location

Health Workouts – Using the healthdb_secure.sqlite database I’ve extracted much of the metadata from workouts. I’ve also determined some of the workout types (ie: HIIT, Rower, Run, Walk, etc), but have not enumerated all of them yet. Please let me know if you come across others – easier if you do this on your own data and can easily look it up. Same for weather conditions (Sunny, Rainy, etc.).

  • health_workout_elevation

  • health_workout_general

  • health_workout_humidity

  • health_workout_indoor

  • health_workout_location_latitude

  • health_workout_location_longitude

  • health_workout_temperature

  • health_workout_timeofday

  • health_workout_timezone

  • health_workout_weather

Network and Application Usage using netusage.sqlite & DataUsage.sqlite iOS Databases

Two iOS databases that I’ve always found interesting (and probably should test more) are netusage.sqlite and DataUsage.sqlite. These two databases contain very similar information – one is available in a backup (and file system dumps) the other only in file system dumps. These databases are excellent at tracking application and process network usage. 

These databases can provide answers to investigative questions such as:

  • What apps were being used?

  • What apps were used more than others?

  • Did the device communicate over cellular or wi-fi more often and when?

  • What apps were used that are no longer on the device?

These databases are located in the following locations depending on the type of acquisition available.

  • /private/var/networkd/netusage.sqlite

    • Available in File System dumps only.

  • Backup: /wireless/Library/Databases/

    • DataUsage.sqlite

    • DataUsage-watch.sqlite (yes, there is one just for the Apple Watch!)

  • File System: /private/var/wireless/Library/Databases/DataUsage.sqlite 

I’ve created modules for these databases in APOLLO, but you can also use the SQL queries in a standalone environment. I’m still working on how best to represent the timestamp keys and may alter the APOLLO code to accept multiple timestamp keys. This will help some other modules I’ve been working on as well so keep an eye out for that. I also need to work on acceptance of multiple database names, thanks to DataUsage-watch.sqlite.

The first set of modules are netusage_zprocess and datausage_zprocess. These two use the same SQL query as it is the same table, just different databases. These query extracts the process and/or the application bundle id. This query will show two timestamps:

  • TIMESTAMP – I believe this is the most recent timestamp for the process/application.

  • PROCESS FIRST TIMESTAMP – This appears to be the first usage of the process/application.

The first example comes from netusage.sqlite, the second from DataUsage.sqlite. It is notable to show that more information is available potentially from DataUsage.sqlite. Since this database is backed up it has the potential to have very historical data. These examples come from my iOS 11.1.2 dump. NetUsage goes back to November 4,2017 when I first setup iOS 11 on the device. The DataUsage database on the same device goes all the way back to 2013! This was from my iPhoneX which certainly did not exist in 2013. I restore backups onto new devices. I also get many more records from the DataUsage database.

netusage_zprocess

datausage_zprocess

The next set of queries are netusage_zliveproces and datausage_zliveprocess. These modules technically have a copy of the ZPROCESS data so they may be redundant if you are running APOLLO. Again, this is the same query for each database. DataUsage will again have many more entries. The added value from the ZPROCESS queries is the added network data information, Wi-Fi In/Out and WWAN In/Out. I will assume this value is stored in bytes until I can test further.

The major difference that I can tell between the two databases (apart from number of records), is that the DataUsage database does not record Wi-Fi network data. I know for sure I was on Wi-Fi at some point in the last six years! (It shows in NetUsage – remember it is the same device. Check out my Twitter data, its almost horrifying! 🤭)

netusage_zliveprocess

datausage_zliveprocess

Finally, we have an additional query only for the netusage.sqlite database, netusage_zliverouteperf. This query extracts lots of information, some of which I have no idea what it is. The first step into determining this is creating the query! In addition to some timestamps that appear to be stored on a per-hour basis we have the type of network traffic (Cellular or Wi-Fi), bytes and packings coming and going, connection information.

A second screenshot is required to show the rest of the extracted data. Some sort of cellular network identifier (any ideas?) or Wi-Fi (SSID/BSSID) are provided, with additional network-based information.

There is a lot of data going through these pipes!