(Ab)using Socrata for Personal Fun and Fitness with Jawbone
One of the many perks of being a Socrata employee is having not just great insight into the power of data and technology, but also personal access to a platform purpose-built to make that power quick and easy to wield. So, while developing the launch version of GovStat last year, we of course constantly joked and mused about tracking goals around everything from commute times to whiskeys consumed. It was only a matter of time before one of us actually dropped some personal data into the engine to see what would emerge. I volunteered as tribute for this exercise.
The Fitness Band
I’ve been wearing a Jawbone Up for the better part of a year now — it tracks my daily activity in the form of steps per day, and my daily sleep in the form of not enough minutes per night. The mobile app provides a pretty great visualization of your activity on a daily or weekly level, and lets you set daily goals for steps walked and hours slept. However, it doesn’t really provide a longer term view of your historic data, and you can’t set farther reaching goals. So, one day after a fitness-band-guilt-induced long walk home, I decided to drop the data into GovStat.
One would imagine that working with a hip startup’s data service would be significantly easier than rescuing data from decades-old database systems living on machines people have long since forgotten about. But as it would turn out, I was about to embark upon a journey of Rube Goldbergian proportions.
The Data Source “API”
Things have changed now, and Jawbone has a much more complete API available, but when I began this adventure last summer, the only way to retrieve your data from Jawbone’s servers was to rely on one of their built-in API partners. Of these, most were fitness or nutrition tracking services which themselves wouldn’t let me get the data easily. Thankfully, the ever-reliable IFTTT was available to perform its usual task of gluing the Internet together.
Of course, I still needed a way of getting the data to a place from which I could easily automate the upload to Socrata.
Thus was born the Jawbone-to-Dropbox IFTTT recipe. It takes the relevant data from Jawbone, formats them as if they were a line of a CSV file, and drops them in a file in my Dropbox account. Now that I’d built some complicated machinery to turn Jawbone fitness data into a CSV file on my own computer, the next step was to write a script to upload that data to our platform.
We support a lot of different data upload methods, and we have a variety of pre-written libraries to serve this purpose, probably in your favourite programming language. In this case, I opted to use the soda-js library that I wrote while attending a hackathon held by Alameda County. The library supports ingesting data into Socrata datasets via upsert, which is perfect here: each day is unique, so I can key each uploaded row on the day it belongs to, and re-running the import updates existing rows rather than duplicating them.
Now that we have all of our parts chosen and ready, it’s time to assemble our data-updating robot (lovingly dubbed Upbot).
The first step is to read in the CSV file and parse it back into fields so that I can perform some ETL on the values. There are a number of great NPM packages available that perform this work, but in this case as I was “generating” the data myself via the IFTTT recipe, I knew that the data would be well-behaved, so I simply dropped in a regex-based solution I happily copied off of the Internet. That done, all I had to do was load the file, verify that each row looked sane, and drop it in a local array.
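The actual snippet I grabbed is lost to the Internet, but a simplified stand-in captures the idea. This sketch assumes no quoted or escaped fields, which held true for my IFTTT-generated data; real-world CSV deserves a proper parser:

```javascript
// Naive CSV line parser: split on commas, trimming surrounding
// whitespace. Only safe because the IFTTT recipe controls the
// format -- quoted or escaped fields would break this.
function parseLine(line) {
  return line.split(/\s*,\s*/).map(function (field) {
    return field.trim();
  });
}

// A hypothetical line as the recipe might have written it:
// date, steps walked, minutes slept.
var fields = parseLine('2013-09-27, 11243, 412');
```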
Now that we have the data loaded programmatically, it’s time to manipulate it just a bit to fit our platform. At Socrata, we value quality engineering even when writing random side-scripts, so I’ve set up the robot’s ETL operations to actually accept configuration via a local file, so that I can repurpose it for different uploads I might need or want in the future.
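The configuration itself is nothing fancy. The field names and shape below are illustrative, not the actual file, but they show the idea: the config maps CSV column positions onto dataset field names and types, so pointing the robot at a different dataset is just a new config file:

```javascript
// Hypothetical config: maps CSV column positions to dataset
// field names, plus a type to coerce each value to on the way in.
var config = {
  dataset: 'abcd-1234', // target dataset id (made up)
  columns: [
    { name: 'date', type: 'date' },
    { name: 'steps', type: 'number' },
    { name: 'sleep_minutes', type: 'number' }
  ]
};

// Turn one parsed CSV row into an object ready for upload.
function toRow(fields) {
  var row = {};
  config.columns.forEach(function (col, i) {
    row[col.name] = (col.type === 'number') ? Number(fields[i]) : fields[i];
  });
  return row;
}
```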
Lastly, it’s time to actually upload the data to our platform. After all that machinery, it’s a breath of familiar relief that the part that relates to Socrata is the dead simplest:
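Something like this, with a hypothetical host, credentials, and dataset id standing in for the real configuration (this is a sketch of soda-js’s producer API from memory; check the library’s README for the exact setup):

```javascript
var soda = require('soda-js');

// Rows keyed by date, so re-running the upload overwrites
// rather than duplicates. Values here are illustrative.
var rows = [{ date: '2013-09-27', steps: 11243, sleep_minutes: 412 }];

// Hypothetical connection details; the real ones live in the config file.
var producer = new soda.Producer('example.socrata.com', {
  username: 'me@example.com',
  password: 'hunter2'
});

producer.operation()
  .withDataset('abcd-1234')
  .upsert(rows)
  .on('success', function () { console.log('uploaded ' + rows.length + ' rows'); })
  .on('error', function (err) { console.error('upload failed:', err); });
```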
If I didn’t care about logging status, I could even forget about the latter two lines.
Now, I finally had an automatic data import process. All it took was the combined effort of five separate services: Jawbone, IFTTT, Dropbox, my own private server, and Socrata. I set up a cronjob to run every evening, and watched the data pour in each day. You can find the robot in its entirety here.
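The schedule itself is just a one-line crontab entry; the paths and time here are illustrative, but it looks roughly like:

```
# Run Upbot every evening at 11pm (paths are hypothetical)
0 23 * * * /usr/bin/node /home/clint/upbot/upbot.js >> /var/log/upbot.log 2>&1
```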
Of course, the original point of this endeavour was to form long-term goals about my fitness on GovStat. So, I created ClintStat and pointed my data robot at it. One short jaunt with the goal creator later, I had my shiny new GovStat goal.
A GovStat goal is composed of a couple of simple elements: a numeric value that changes over time, representing the latest atomic measurement of your goal, and a time value that signifies when that measurement was made. In this case, the measurement was the number of steps I walked per day, and I chose midnight of each day as that day’s time value. My goal took the average number of steps I walked over the entire tracking period, and sought to bring that average up to 12,000 steps a day.
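In other words, the goal’s indicator is just a running mean over the measurements. Conceptually (sample values made up):

```javascript
// Each measurement pairs a day with that day's step count.
var measurements = [
  { date: '2013-09-25', steps: 9500 },
  { date: '2013-09-26', steps: 14500 }
];

// The goal's current value: mean steps per day across
// the entire tracked period.
function averageSteps(ms) {
  var total = ms.reduce(function (sum, m) { return sum + m.steps; }, 0);
  return total / ms.length;
}

// On track once the average reaches the 12,000-step target.
var onTrack = averageSteps(measurements) >= 12000;
```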
This was great fun! I’d managed to coax data out of an unwilling system by sending it on an odyssey around the Internet, and now I was seeing a piece of my life laid out across the screen in front of me such that I could quantifiably track and improve it. So of course, something had to go wrong.
The Empire Strikes Back
Jawbone changed their output format. This happened in late September, right where the data in the above screenshot peters out. Specifically, they changed the way they formatted their dates, maddeningly enough into a less machine-readable form. Moreover, in a twist of humour only time-related programming could invoke, my date parser was now interpreting the day of the month as the year, meaning that suddenly I was literally walking all over the 20th century.
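Beyond updating the parser, the real lesson is defensive: a sanity check on the parsed year would have made the format change fail loudly instead of silently rewriting history. A sketch of that idea (the date bounds are arbitrary assumptions, not anything Jawbone specifies):

```javascript
// Parse a date string and refuse anything landing outside a
// plausible window -- this is the check that would have caught
// the day-of-month-read-as-year regression immediately.
function parseDay(s) {
  var d = new Date(s);
  if (isNaN(d.getTime()) || d.getFullYear() < 2000 || d.getFullYear() > 2100) {
    throw new Error('suspicious date: ' + s);
  }
  return d;
}
```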
As fun as time travel is, this is why at Socrata we never change our API in an impactful fashion without warning developers first. You can find this information on our API changelog, and we always warn of possibly breaking changes coming up via our support site. If you haven’t already, bookmark those resources now!
Upon updating our ETL script, we got our beautiful data back, and you can see me triumphantly meet my fitness goal for the year:
Obviously, the winter months were not conducive to my wanting to walk to and from work.
The Poignant Conclusion
Failed personal goals aside, this was a great look at how, even faced with the most difficult data ingest problems, it’s possible to get live data into our platform. We learned that sometimes data and data workflows are just hard no matter the source, which puts the struggles we see with transparency data in perspective. It’s also a reminder that data can be used not just for civic good, but also for silly fun and personal betterment.
If you have any questions about this post or using our API, please don’t hesitate to drop me a note on Twitter!