Making the Two-Way Street of Open Data a Reality
By Chris Whong, Socrata Data Solutions Architect
Open data has by and large been a one-way conversation. Governments produce public data and make it freely available, while citizens, journalists, researchers and hackers consume it in whatever ways suit them. But more eyes on the data once it is released may also return value to the government, turning users of the data into a source of new data and quality control. This is the experiment in two-way open data that New York City is pioneering with OpenStreetMap.
OpenStreetMap (OSM) is the “Wikipedia of Maps”, where anyone can contribute changes. (Yes, if there’s a footpath or bike trail near your house that doesn’t show up on mainstream web maps, you can literally “draw” it into OSM, name it, and connect it to existing roads.) Like Wikipedia, changes to the map are subject to quality control by the rest of the community, and can be just as easily undone. Users may choose to update the map for many reasons, from just knowing more about conditions on the ground than anyone else, to improving the map for a specific project such as an app.
What if a user needed some building outlines that OSM didn’t have yet? That user could manually trace over the satellite imagery, pointing and clicking lots of custom polygons into existence. But what if they needed a whole town? What if they needed New York City? They do, and NYC has an open dataset for that. The city’s detailed GIS database of building outlines and point data is freely available for download at data.cityofnewyork.us. While it will still take human effort to import and verify data for over a million buildings, tracing them all by hand would be impractically time-consuming. Leaders in the mapping tech community have partnered with NYC’s Department of Information Technology and Telecommunications (DoITT), the provider of the footprints data, and are launching a community project to systematically import the city’s treasure trove of building data into OSM. The project was announced in late September.
Legions of Geospatial Analysts
So what’s in it for the city? The potential for updates that only a system like OSM can provide. If OSM users see something wrong, they can fix it. Maybe a building footprint is misaligned, or maybe the building doesn’t exist anymore. The city receives a daily update of changes to the building data, and can review those changes. If they are legitimate, DoITT can apply those changes to its own master database, making it more accurate and up to date. It’s as if the city has legions of geospatial analysts quality-checking its data and sending updates!
Alex Barth, Data Lead at the web mapping company MapBox and OSM advocate, has been a key organizer of the NYC-OSM collaboration, and has been working on the idea since early 2012. The data was already publicly available back then, but carried a license that was incompatible with OSM. NYC’s Open Data Law, passed in March 2012, cleared up the licensing issue and provided the way forward.
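To make the daily review concrete, here is a minimal sketch of how one day's changes might be summarized before a person looks at them. It is not DoITT's actual process, which the project has not described in detail. The building IDs, attribute names, and the diff_snapshots helper are all hypothetical, chosen only for illustration.

```python
# Hypothetical sketch: summarize what changed between two snapshots of
# building records, each keyed by a made-up building ID.

def diff_snapshots(old, new):
    """Return the IDs that were added, removed, or modified between snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return {"added": added, "removed": removed, "modified": modified}


# Illustrative records only; these are not real NYC or OSM data.
yesterday = {
    "b1": {"height_m": 12, "status": "standing"},
    "b2": {"height_m": 30, "status": "standing"},
    "b3": {"height_m": 8, "status": "standing"},
}
today = {
    "b1": {"height_m": 12, "status": "standing"},
    "b2": {"height_m": 30, "status": "demolished"},  # edited by a mapper
    "b4": {"height_m": 15, "status": "standing"},    # newly drawn building
}

changes = diff_snapshots(yesterday, today)
print(changes)
```

A reviewer would still need to judge each flagged record; the summary only narrows down where to look.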
To Barth, the project is not simply about buildings, but is an experiment and learning experience about the impact of community-driven projects like OSM. “It’s a data improvement effort that has positive side effects and really lets us grow the community.” The longer-term vision goes beyond OSM or even geodata, and hopes to redefine open data publishing: “This is about an open data commons, a single space in which government and citizens interact.”
Volunteer mappers first gathered in October to begin the monumental task of importing the city’s data. Liz Barry, another leader in the NYC-OSM collaboration, hosted the meeting at the offices of the Public Lab in Brooklyn, and 22 community members showed up to help. The data was broken down by election district, and the team set about validating footprints against aerial imagery, checking geometries, and correcting overlapping polygons. Existing attribute data in OSM could also be merged with better polygons from the city data. Barry said the workflow is still being vetted, and is not quite ready for full-scale deployment. The idea is that once the workflow is perfected, updates won’t require a physical meetup. Volunteer OSM users will be able to import a chunk of the city’s building footprints whenever and wherever they can.
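For readers curious what "checking geometries" and finding overlapping polygons can look like in code, here is a small sketch. It is not the volunteers' actual tooling, which is not described here. It assumes each footprint is a closed ring of (x, y) points, and all names and coordinates are hypothetical. The overlap test compares bounding boxes only, so it flags pairs worth a closer look rather than proving that two buildings overlap.

```python
from itertools import combinations

# Hypothetical sketch: basic sanity checks on building footprints, each
# given as a closed ring of (x, y) points. Coordinates are made up.

def ring_area(ring):
    """Signed area of a closed ring, computed with the shoelace formula."""
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))


def is_plausible_footprint(ring):
    """A usable ring is closed, has at least 4 points, and encloses some area."""
    return len(ring) >= 4 and ring[0] == ring[-1] and abs(ring_area(ring)) > 0


def bbox(ring):
    """Axis-aligned bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def bboxes_overlap(a, b):
    """True if two bounding boxes share interior area."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def overlap_candidates(footprints):
    """Pairs of footprint IDs whose bounding boxes overlap (a cheap pre-filter)."""
    boxes = {key: bbox(ring) for key, ring in footprints.items()}
    return [(a, b) for a, b in combinations(sorted(boxes), 2)
            if bboxes_overlap(boxes[a], boxes[b])]


footprints = {
    "A": [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)],
    "B": [(3, 3), (6, 3), (6, 6), (3, 6), (3, 3)],
    "C": [(10, 10), (12, 10), (12, 12), (10, 12), (10, 10)],
    "D": [(0, 0), (1, 1), (0, 0)],  # degenerate: too few points, no area
}

# Drop rings that fail the basic checks, then look for possible overlaps.
valid = {k: r for k, r in footprints.items() if is_plausible_footprint(r)}
print(sorted(valid))
print(overlap_candidates(valid))
```

In practice a mapper would confirm each flagged pair against aerial imagery, which is the step the volunteers perform by hand.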
The real fun will begin when large amounts of the data have been successfully imported and the city can report back about the volume and utility of OSM-contributed changes. In many cases, there may be more information about a building in OSM than the city maintains on its own, meaning the “two-way street” of open data may not flow evenly in both directions. The OSM community has found a partner in DoITT, and this experiment will serve as an early model of the power of citizens and activists to improve government data.