Logging and Analysis with Your GPS-Enabled Phone

by flippy

Most new cell phones in the U.S. come with some form of GPS receiver.

While this addition does not necessarily enable widespread tracking of mobile phone users (your rough location could already be determined by which cell tower you are "attached" to, or via triangulation), it does potentially improve the accuracy with which someone can be tracked.

But this article will put most privacy concerns aside, dealing instead with what fun you can have with your GPS-enabled phone.  The analysis is obviously not phone-specific, but can use GPS coordinates from any device.  I mention the phone as it is more likely you will be carrying that with you.

The first step is to determine some way of logging GPS coordinates from your phone at a regular interval.  HP/Palm's webOS phones provide this capability through geolocation services that are accessible from shell scripts, scheduled with cron. [1]

You iPhone and Android users are on your own, but I am sure you can think of something.

Analysis

At this point, I will assume you have some kind of database with timestamped GPS coordinates for some time interval.  Now you can extract GPS coordinates and analyze the location data.
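If you do not have such a database yet, here is one minimal sketch of what it could look like, using Python's built-in sqlite3 module.  The table name, the column names, and the helpers log_fix and fixes_between are all made up for illustration; use whatever your logger actually produces.

```python
import sqlite3

# Hypothetical schema: one row per logged GPS fix.
conn = sqlite3.connect(":memory:")  # use a file path for a real log
conn.execute(
    """CREATE TABLE fixes (
           ts    INTEGER PRIMARY KEY,  -- Unix time of the fix, in seconds
           lat   REAL NOT NULL,        -- latitude, decimal degrees
           lon   REAL NOT NULL,        -- longitude, decimal degrees
           err_m REAL                  -- reported horizontal uncertainty, meters
       )"""
)

def log_fix(ts, lat, lon, err_m):
    """Store one fix; a repeated timestamp overwrites the earlier row."""
    conn.execute(
        "INSERT OR REPLACE INTO fixes VALUES (?, ?, ?, ?)",
        (ts, lat, lon, err_m),
    )

def fixes_between(start_ts, end_ts):
    """Return (ts, lat, lon, err_m) rows with start_ts <= ts < end_ts, oldest first."""
    cur = conn.execute(
        "SELECT ts, lat, lon, err_m FROM fixes "
        "WHERE ts >= ? AND ts < ? ORDER BY ts",
        (start_ts, end_ts),
    )
    return cur.fetchall()

# Two made-up fixes, 15 minutes apart; the second is a poor cell-tower fix.
log_fix(1300000000, 40.0000, -75.0000, 25.0)
log_fix(1300000900, 40.0005, -75.0000, 1500.0)
```

Keeping the uncertainty in its own column means you can decide later which fixes are good enough for a given analysis instead of throwing them away at logging time.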

The first obvious thing you can do is analyze how much time you spend where.

The GPS positions always come with some uncertainty (you are recording the errors on your position, right?).  So, define a latitude/longitude (and altitude if you really want) [2] for each location you want to monitor, along with a radius within which you consider a point to be part of that location.  As you crawl through your GPS log, compute the distance from each logged GPS position to each of your defined points of interest.  If a GPS entry is within the location circle, mark it as that location for that time interval.

To calculate the distance between a GPS coordinate and a location of interest, use the Haversine formula. [3]

The Earth is not perfectly round, but the Haversine formula and the mean radius of the Earth should provide sufficient accuracy for most needs (if you need more accuracy, shame on you, you should know everything in here already).  Of course, the accuracy of this method is dependent on the accuracy of your GPS data.
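As a rough sketch of the distance and tagging steps described above, something like the following would do.  The Haversine function is the standard formula with a mean Earth radius of 6,371 km; the PLACES table, its coordinates and radii, and the tag_fix helper are made-up examples, not anything your phone provides.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))

# Hypothetical points of interest: name -> (lat, lon, radius in meters).
PLACES = {
    "home": (40.0000, -75.0000, 150.0),
    "work": (40.0500, -75.0200, 200.0),
}

def tag_fix(lat, lon, places=PLACES):
    """Return the name of the closest place whose circle contains the fix, or None."""
    best = None
    for name, (plat, plon, radius) in places.items():
        d = haversine_m(lat, lon, plat, plon)
        if d <= radius and (best is None or d < best[0]):
            best = (d, name)
    return best[1] if best else None
```

Calling tag_fix(40.0005, -75.0) lands about 56 meters from the "home" center and so comes back as "home", while a point a full degree away matches nothing and returns None.  Picking the closest match only matters if you define overlapping circles.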

If most of your data points correspond to the location of the cell tower (meaning you have several thousand meter uncertainties), it probably does not do much good to try and differentiate between your garage and bedroom.

In addition to simply tracking how much time is spent at different places, you can track time spent traveling between locations.  This is easy to do using the Haversine distance between adjacent points and the time interval between logged points.

Setting a threshold of speed then allows you to tag a time interval as "traveling."  The threshold should be set high enough that it is not affected by the scatter between temporally adjacent GPS entries where you are stationary, but low enough that you do not need to be going at highway speeds.
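A minimal sketch of that speed test follows.  The threshold of 1 m/s is my own guess, not a recommendation: with points 15 minutes apart it means you moved at least 900 meters in a straight line, which should sit well above typical position scatter.  The function names and the (ts, lat, lon) tuple layout are also just assumptions for the example.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2)
         * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))

SPEED_THRESHOLD_MPS = 1.0  # assumed threshold; tune it against your own data

def tag_travel(fixes, threshold=SPEED_THRESHOLD_MPS):
    """Tag each interval between adjacent fixes as traveling or not.

    fixes: list of (ts, lat, lon) tuples sorted by ts (Unix seconds).
    Returns a list of (start_ts, end_ts, speed_mps, traveling) tuples.
    """
    intervals = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(fixes, fixes[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        speed = haversine_m(lat0, lon0, lat1, lon1) / dt
        intervals.append((t0, t1, speed, speed >= threshold))
    return intervals
```

Keep in mind that this is the average straight-line speed between two logged points, so it underestimates your real speed whenever the route bends or you stop partway through the interval.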

Visualization

With some relatively simple Python scripting, you have analyzed your database of GPS coordinates and produced tabulated data on how often you frequent specific locations or how much time you spend traveling.  Most people do not enjoy staring at tables of numbers, so you should think of ways to visualize.

One easy method is to make histograms.

For a set time period (maybe one week?), compute the amount of time you spent at home, at work, and traveling.  Do this for multiple weeks, and a row-stacked histogram can quickly show you coarse trends with time.
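One way to build the numbers behind such a histogram, sketched with only the standard library.  It assumes you already have tagged intervals as (start_ts, end_ts, label) tuples; that layout, the function names, and the crude text bars are all my own invention, and you would feed the totals to your plotting tool of choice for a real chart.

```python
from collections import defaultdict
from datetime import datetime, timezone

def weekly_hours(intervals):
    """Sum hours per label per ISO week.

    intervals: iterable of (start_ts, end_ts, label), Unix seconds.
    Each interval is credited to the week in which it starts (UTC).
    Returns {(iso_year, iso_week): {label: hours}}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for start, end, label in intervals:
        iso = datetime.fromtimestamp(start, tz=timezone.utc).isocalendar()
        totals[(iso[0], iso[1])][label] += (end - start) / 3600.0
    return totals

def text_bars(totals, labels=("home", "work", "traveling"), hours_per_char=4):
    """Render one stacked bar per week, one letter per hours_per_char hours."""
    lines = []
    for week in sorted(totals):
        bar = "".join(
            label[0].upper() * round(totals[week].get(label, 0.0) / hours_per_char)
            for label in labels
        )
        lines.append(f"{week[0]}-W{week[1]:02d} {bar}")
    return lines
```

Crediting a whole interval to the week it starts in is a simplification; with points every 15 minutes or so the error at week boundaries is negligible.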

Also, if you do not fear giving Google GPS coordinates of your travel destinations, use the Google Maps Static API to generate and download maps of your travels.  Multiple latitude/longitude pairs can be submitted through an HTTP request which will generate a PNG image showing the locations.  Other options are available; see the API website for more information. [4]
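As a sketch, building such a request URL might look like the following.  The endpoint and the size, markers, and key parameters reflect the Static API as I understand it, and newer versions of the service expect an API key; treat all of that as an assumption and check the API documentation before relying on it.  The function name and the placeholder key are made up.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the current API documentation.
STATIC_MAP_URL = "https://maps.googleapis.com/maps/api/staticmap"

def static_map_url(points, api_key, size="640x480"):
    """Build a Static Maps request URL with one marker per (lat, lon) pair."""
    # Marker locations are joined with "|"; five decimals is roughly 1 m.
    markers = "|".join(f"{lat:.5f},{lon:.5f}" for lat, lon in points)
    query = urlencode({"size": size, "markers": markers, "key": api_key})
    return f"{STATIC_MAP_URL}?{query}"

url = static_map_url([(40.0, -75.0), (40.05, -75.02)], api_key="YOUR_KEY_HERE")
```

Fetching that URL with any HTTP client should return the image.  Request URLs have a length limit, so for a long trip you will want to thin out the points before submitting them.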

That certainly is not a complete set of visualization options, but should at least give you a head start...

Practical Issues

The major practical issue is how often to record your location.

In my experience, this is a balance between desired temporal resolution and battery life of your GPS device.  I have experimented with 10, 15, and 20 minute intervals.  The 10 minute intervals seemed to drain the battery too quickly over the course of the day (especially if GPS fixes were difficult to attain), while the 20 minute intervals gave only three points per hour and I desired slightly better resolution than that.

Experiment with values depending on your mobility, typical positioning accuracy, and battery life.

Data storage requirements are minimal for only a few points per hour (you should only accumulate a few megabytes a year of raw plus analyzed data).  Processing load is also mild for the aspects discussed above, but will obviously increase with more complex data mining.
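A quick back-of-the-envelope check of that storage figure, assuming roughly 100 bytes per logged row (timestamp, coordinates, uncertainty, a tag, and some database overhead).  The 100-byte figure is a guess for illustration, not a measurement.

```python
def yearly_storage_mb(interval_min, bytes_per_row=100):
    """Estimate a year of log storage in megabytes for a given logging interval."""
    rows_per_year = (60 / interval_min) * 24 * 365
    return rows_per_year * bytes_per_row / 1e6

# 15-minute logging: 4 rows/hour * 24 * 365 = 35,040 rows per year.
estimate = yearly_storage_mb(15)
```

At 15 minute intervals that works out to about 3.5 MB a year, and even 10 minute logging stays a little over 5 MB.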

Privacy/Security Concerns

Although your location information is available to your cell phone provider (and certainly your friendly government), it makes sense to provide some security for your database of GPS coordinates.

This will inhibit tampering and deletion of your repository of geo-location information.  Password-protected databases on encrypted hard drives are a good start.

Air-gapping the repository from the broader Internet is even better (unless you want to submit GPS data as it is recorded by your phone).

There is no sense in handing over a log of your location of the past X months without a fight...

References/Footnotes

  1. www.webos-internals.org/wiki/Patch_webOS_GPS_Tracking
  2. I find the assisted-GPS on the webOS phone rarely provides (accurate) altitude information so it may or may not be useful to factor this into the analysis.
  3. en.wikipedia.org/wiki/Haversine_formula
  4. code.google.com/apis/maps/documentation/staticmaps