USAID Invites World’s Citizens to Contribute Data
The United States Agency for International Development’s (USAID) GeoCenter had a dataset of approximately 117,000 records that it wanted to make open and available to the public. The dataset described the locations of loans made by private banks in developing nations.
USAID hoped that by giving the public this granular geographic data about the loans, civic hackers and everyday citizens would compare the data with other datasets USAID had available and “make significant and creative contributions to how USAID does business.”
The data, though, was not ready to be published. Not only was it piled together rather than organized into fields, it had not been entered in a consistent format. For example, spelling of place names, use of commas, and abbreviations all varied.
While most of the geodata could be cleaned up automatically with software, about 10,000 locations required human processing, and that’s when USAID considered crowdsourcing.
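To make the split between automated cleanup and human review concrete, here is a minimal sketch in Python. It is not the software USAID used; the abbreviation table, the `known_places` set, and both function names are assumptions made up for illustration. It normalizes the kinds of inconsistencies described above (punctuation, capitalization, abbreviations) and sets aside anything that still does not match a known place.

```python
import re

# Hypothetical abbreviation table; the article does not describe the real rules.
ABBREVIATIONS = {"st": "saint", "mt": "mount", "ft": "fort"}


def normalize_place(raw: str) -> str:
    """Lowercase, drop stray punctuation, collapse spaces, expand abbreviations."""
    text = raw.lower()
    text = re.sub(r"[.,;]+", " ", text)  # inconsistent commas and periods
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(words)


def split_records(records, known_places):
    """Return (cleaned, needs_review): matches go through, the rest go to people."""
    cleaned, needs_review = [], []
    for raw in records:
        name = normalize_place(raw)
        if name in known_places:
            cleaned.append(name)
        else:
            needs_review.append(raw)  # keep the original text for the reviewer
    return cleaned, needs_review
```

In a sketch like this, an entry such as "Mt. Kenya" would resolve automatically, while a free-text description of a location would fall into the review pile for the crowd.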
You can hear about USAID’s process and incredible success from principal GIS analyst Shadrock Roberts.
Calling Upon the Crowd
The GeoCenter studied possible regulatory hurdles, tested a crowdsourcing approach, and decided to go with it. Socrata created an app that used Data.gov as a platform for editing and generating tabular data.
With the app, “crowd members” checked out 10 data entries at a time. The app kept track of which member was which, when they took the data, and other key stats. Socrata developer evangelist Chris Metcalf built the app.
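The check-out mechanism can be sketched roughly as follows. This is an illustrative Python model, not the code Metcalf wrote; the class names, fields, and in-memory queue are all assumptions. It hands out up to 10 records per request and logs who took them and when.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

BATCH_SIZE = 10  # the article says members checked out 10 entries at a time


@dataclass
class Checkout:
    member: str
    record_ids: list
    taken_at: datetime


@dataclass
class CheckoutQueue:
    pending: list                       # record IDs not yet handed out
    log: list = field(default_factory=list)

    def check_out(self, member: str) -> list:
        """Hand the next batch to a member and record who took it and when."""
        batch = self.pending[:BATCH_SIZE]
        self.pending = self.pending[BATCH_SIZE:]
        if batch:
            self.log.append(Checkout(member, batch, datetime.now(timezone.utc)))
        return batch
```

A real service would also need persistent storage and a way to return batches that were never finished, which this sketch leaves out.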
“The geodata literally said things like, ‘It’s the place behind the chicken shack.’ So, they got a bunch of people who understood the geography of the cities to parse the details in the data set,” says Metcalf. “You might not have the right address of the place behind the chicken shack but at least you know which neighborhood it is in. Which, in developing countries, is actually really good because we’re not talking about street names we’re talking about dirt paths and stuff.”
Though the app was almost overwhelmed at times during the 16-hour sprint to complete the project, it did allow the much-needed data cleanup to occur.
“[The app] got swamped with traffic. There were so many users on the app that it overwhelmed the Heroku application that was running the service. I added additional capacity to the Heroku application to keep things working,” says Metcalf.
Metcalf considers the project a great success because it allowed citizens to help USAID make use of data they “knew very little about before.”
In its summary of the project, USAID shared hope for similar successes in the future: “The project has the potential to encourage more agencies to publish more data in a cost free manner and engage an interested and experienced public directly in U.S. Government work. This ‘data as dialogue’ has transformative power not only for data processing, but also building greater awareness of USAID’s mission, goals, and work.”
You can read USAID’s detailed description of its process in “Crowdsourcing to Geocode Development Credit Authority Data: A Case Study.”