Power To The People: How The Public Can Improve Government’s Data Collection Methods
So you’ve got open data, but how open are your (or your government’s) data collection methods?
Curiously, this is a question an astro-particle physicist and his hackathon teammates* (called EcoSleuth) ended up wrestling with at the environmentally-themed DataBay Challenge last month, which was organized by Maryland Governor Martin O’Malley and several partners to involve the state’s citizens in developing creative solutions to issues facing the health of the Chesapeake Bay.
At first glance, it doesn’t seem like astro-particle physicist Brian Baughman would have much to say about Chesapeake Bay data collection methodology. However, his job at the University of Maryland has him conducting large-scale data analyses on a daily basis, and he was intrigued by the prospect of using his big data skills to improve the Chesapeake.
But, once he got to the DataBay event, Baughman’s idea about what he could contribute changed. “I looked at the data sources and realized we couldn’t build any real analysis in a weekend, since the data sets were dispersed. So, I thought creating an open, public database would be a great way to support my interest in collecting and analyzing big data, but having it all gathered into a single, accessible database,” said Baughman in a phone interview with Socrata.
Besides the dispersed data sets, the other issue the EcoSleuth team noticed was the competing needs of numerous government agencies. Early in the hackathon, the Maryland Department of Natural Resources (DNR) pitched for an angler’s log, which would allow fishermen to relay information about invasive species they came across in the Chesapeake. At the same time, the Environmental Protection Agency (EPA) wanted to crowdsource reports of algae blooms by color and water.
EcoSleuth saw an opportunity to help out both agencies (and potentially many other government bureaus) by building a framework to crowdsource environmental data, collecting it not just from fishermen or people who live by the water, but from any citizen scientist.
“The thing we realized,” Baughman said, speaking of conversations his hackathon team had, “is that it didn’t have to be so specific, although that’s what each agency was asking for. Instead, we could build a framework of apps to collect data for DNR and EPA or really any agency. So we decided to not just build the initial concept of a single, crowdsourced database, but one that could be skinned with any template specific to the data the agency was collecting.”
As a proof of concept, Justin Leishman, the team’s developer, started building EcoSleuth in iOS with the following functions:
- take a picture (a photograph lets scientists do more in-depth analysis through the database’s backend)
- record geolocation data
- permit basic data collection questions (such as “What color is the algae?”)
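The idea of one framework that can be skinned per agency can be sketched in a few lines. The Python below is only an illustration, not the team's iOS code: the `Template` and `Observation` names, the list of algae colors, the photo path, and the coordinates are all assumptions made for the example. A template carries the agency's questions, and every observation carries a photo, a location, and answers checked against that template.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class Template:
    """Hypothetical agency-specific 'skin': a name plus the questions to ask."""
    name: str
    questions: Dict[str, List[str]]  # question text -> allowed answers


@dataclass
class Observation:
    """One crowdsourced report: a photo, a location, and template answers."""
    template: Template
    photo_path: str
    latitude: float
    longitude: float
    answers: Dict[str, str] = field(default_factory=dict)
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def answer(self, question: str, value: str) -> None:
        """Record an answer, rejecting questions or values the template lacks."""
        allowed = self.template.questions.get(question)
        if allowed is None:
            raise KeyError(f"question not in template: {question}")
        if value not in allowed:
            raise ValueError(f"{value!r} is not one of {allowed}")
        self.answers[question] = value

    def is_complete(self) -> bool:
        """True once every question in the template has an answer."""
        return set(self.answers) == set(self.template.questions)


# Illustrative template; the color choices are assumed, not from the app.
algae = Template(
    name="algae_bloom",
    questions={"What color is the algae?": ["green", "blue-green", "red", "brown"]},
)

# Illustrative photo path and coordinates.
report = Observation(algae, "photos/bloom_001.jpg", 38.97, -76.49)
report.answer("What color is the algae?", "green")
```

Under this sketch, an angler's log for DNR would be a second `Template` with its own questions, while the photo, location, and storage code stay the same.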
Meanwhile, EcoSleuth team members Drew Hart and Varun Manjunatha, both grad students at the University of Maryland, were working to make sure that all the data the app collected would be uploaded to Socrata. The EcoSleuth team reasoned that if all the crowdsourced data was available in Socrata, the data would be:
- open by default
- accessible for everyone (including government analysts, researchers, advocacy groups, and citizen scientists)
- easily graphed and mapped using existing Socrata tools
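To make the upload step concrete, here is a rough sketch of how reports might be flattened into plain rows before being sent to an open data portal. It is not the team's code. The field names, the column naming rule, and the sample report are assumptions, and the actual upload call to Socrata is deliberately left out. Only a fixed set of fields is copied, so anything else a device might attach never reaches the public dataset.

```python
import json
import re
from typing import Dict, List

# Assumed field names for illustration; a real dataset defines its own columns.
PUBLIC_FIELDS = ("recorded_at", "latitude", "longitude", "photo_url")


def column_name(question: str) -> str:
    """Turn a question such as 'What color is the algae?' into a column name."""
    return re.sub(r"[^a-z0-9]+", "_", question.lower()).strip("_")


def to_row(report: Dict) -> Dict:
    """Flatten one report into a single row, keeping only the public fields."""
    row = {name: report[name] for name in PUBLIC_FIELDS}
    # Each template question becomes its own column.
    for question, answer in report["answers"].items():
        row[column_name(question)] = answer
    return row


def to_payload(reports: List[Dict]) -> str:
    """Serialize a batch of reports as a JSON array of flat rows."""
    return json.dumps([to_row(r) for r in reports], indent=2)


# Illustrative report; every value here is made up for the example.
sample = {
    "recorded_at": "2000-01-01T12:00:00Z",
    "latitude": 38.97,
    "longitude": -76.49,
    "photo_url": "https://example.org/photos/bloom_001.jpg",
    "answers": {"What color is the algae?": "green"},
    "device_id": "not-for-publication",
}

print(to_payload([sample]))
```

Flat rows with latitude and longitude columns are the kind of shape that mapping and charting tools generally expect, which is the point of the third bullet above.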
As presented, the EcoSleuth app took second prize at DataBay. Governor O’Malley was so impressed with the team’s work that he invited them to a Bay Cabinet meeting, where they again demonstrated the app and its potential.
Recognizing the value a crowdsourced data app could have for their policy decisions, the Cabinet established a working group to help get the project off the ground. “The conversation [with the Cabinet] has been really productive,” Baughman reported. “When we argued that scientists would want all this data to be completely open, with no responsibilities or identifying information for users, they agreed.”
Once a few tweaks have been worked out, EcoSleuth plans to make the first version of their app (likely called Algae Alert) available in the App Store. The team decided to focus on algae in light of recent blooms in the Great Lakes. Once the algae app is up and running, the team hopes others will fork it to support data collection by other government agencies and advocacy groups.
In the meantime, EcoSleuth is looking for Android developers and funding to keep the original team involved. (See below, they all have day jobs.) Baughman says, “We really want to bring this to a useful stage, post-hackathon, and we hope that Maryland and other governments will consider how to help take these kind of products to market. Then we’d see that hackathons and open data can truly add value, utility, sustainability, and growth.”
- Brian Baughman – astro-particle physicist, University of Maryland
- Drew Hart – bio grad student, University of Maryland
- Justin Leishman – iOS developer
- Varun Manjunatha – machine learning grad student, University of Maryland
- Laura Mcintyre – Chesapeake Bay enthusiast
- Teresa Wong – regulatory and compliance engineer, Maryland Department of the Environment