Metadata Basics and Advice for Open Government Leaders

October 24, 2017 11:57 am PST | Data as a Service

How do citizens and staff find the data you publish? Whether they’re on the web or using an internal data source, they’re likely to search for exactly what they want, and your metadata has a big impact on whether they find it.

Metadata is the data you publish about your data. It describes a dataset’s qualities: subject matter, last update, source, and much more. The following is an overview of metadata basics as well as the story of one open data leader who is using metadata to inspire more citizens to get involved in civic issues.


Metadata Basics from Socrata

The Socrata Team’s “Best Practices for Metadata Management” divides metadata into four common types:

  1. Administrative Metadata – the most common type, produced during data collection, production, publication, and archiving.
  2. Structural Metadata – describes a dataset’s structure, including its format, organization, and variable definitions. This type is in highest demand among researchers and academics.
  3. Reference/Descriptive Metadata – a broad term that mostly involves descriptions of methodology, sampling, and quality.
  4. Behavioral Metadata – records the reactions and behaviors of the dataset’s users such as a rating or user analytics.

Data publishers can use a metadata schema to help organize the list of attributes they’re sharing with users. The following are suggested parts of a metadata schema you could include, according to Socrata’s team:

General Information

  • Dataset Title
  • Brief Description
  • Category
  • Tags / Keywords
  • Row Label

Licensing & Attribution

  • License Type
  • Data Provided By
  • Source Link

Semantics & Resource Description Framework (RDF)

  • Row Class
  • Subject Column

API Endpoint

  • Resource Name
  • Row Identifier

Thumbnail Image

  • Upload Image

Contact Information

  • Contact Email

Socrata also offers a “Sample Metadata Schema” with more information.
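
As an illustration only, the schema sections listed above could be modeled as a single record type. The following is a minimal Python sketch, not Socrata's actual data model or API: the class name, the field names, which fields are required, and the `missing_fields` helper are all assumptions made for this example.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class DatasetMetadata:
    """Hypothetical record mirroring the schema sections listed above."""
    # General Information (treated as required here; that is an assumption)
    title: str
    description: str
    category: str
    tags: List[str] = field(default_factory=list)
    row_label: Optional[str] = None
    # Licensing & Attribution
    license_type: Optional[str] = None
    data_provided_by: Optional[str] = None
    source_link: Optional[str] = None
    # Semantics & Resource Description Framework (RDF)
    row_class: Optional[str] = None
    subject_column: Optional[str] = None
    # API Endpoint
    resource_name: Optional[str] = None
    row_identifier: Optional[str] = None
    # Thumbnail Image (a file path or URL in this sketch)
    thumbnail_image: Optional[str] = None
    # Contact Information
    contact_email: Optional[str] = None


def missing_fields(meta: DatasetMetadata) -> List[str]:
    """Return the names of fields left empty (None, "", or an empty list)."""
    return [name for name, value in asdict(meta).items()
            if value in (None, "", [])]


# Made-up example values, for illustration only
example = DatasetMetadata(
    title="Example Permits Dataset",
    description="Hypothetical record of permit requests.",
    category="Permitting",
    tags=["permits", "zoning"],
    contact_email="opendata@example.org",
)
print(missing_fields(example))
```

A check like `missing_fields` is one way a publisher might spot incomplete metadata before a dataset goes live.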


Metadata Feeds the Cambridge Civic Innovation Challenge Inventory

Metadata can do more than just describe your data. One open data leader is using it to inspire new levels of citizen engagement.

Josh Wolff runs open data for the city of Cambridge, Massachusetts. He has a form that every data owner fills out when submitting data to be published on the Cambridge Open Data Portal. The form requests information in three areas:

  • Dataset logistics and sensitivity review
  • General information about the dataset
  • Information about the dataset’s fields

Included in the form are typical asks, like the expected frequency of updates to the data, keywords that make the dataset easy for users to discover, and a descriptive title. Wolff, though, recently added another field to his submission form that is now automatically feeding a whole new initiative for the city.

Informed by the work of a summer research fellow named Jennifer Angarita from the Harvard Kennedy School’s Ash Center for Democratic Governance and Innovation, Wolff wanted to offer citizens, the media, and researchers “problem statements” for each dataset that could inspire solutions to city issues. He simply added a section to his submission form that asks, “What problem could users help solve using this dataset?”


The responses to this question are now fed automatically into the Cambridge Civic Innovation Challenge Inventory. Challenges include using parking ticket data to improve parking availability and studying permitting data to see development trends in Cambridge. Already, the inventory has inspired one citizen to create a “Permits Dashboard” based on the Board of Zoning Appeals Requests dataset.
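
The article does not describe how Cambridge's automatic feed is built. As a rough sketch of the general idea, the Python below collects the non-empty answers to the problem-statement question from a list of submitted forms. The dictionary keys, the function name, and the sample submissions are hypothetical.

```python
from typing import Dict, List

# Hypothetical key under which a form stores the answer to
# "What problem could users help solve using this dataset?"
PROBLEM_KEY = "problem_statement"


def build_challenge_inventory(submissions: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep one inventory entry per submission that has a problem statement."""
    inventory = []
    for form in submissions:
        statement = (form.get(PROBLEM_KEY) or "").strip()
        if statement:  # skip forms where the question was left blank
            inventory.append({"dataset": form["title"], "challenge": statement})
    return inventory


# Made-up submissions, for illustration only
submissions = [
    {"title": "Example Dataset A",
     "problem_statement": "How could parking availability be improved?"},
    {"title": "Example Dataset B", "problem_statement": "   "},
    {"title": "Example Dataset C"},
]
inventory = build_challenge_inventory(submissions)
print(inventory)
```

Because the inventory is derived from a field data owners already fill in, it needs no separate upkeep in this sketch: a new submission with a problem statement simply becomes a new entry.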

Wolff is excited about how simple the inventory was to set up and sees it making a difference already. He says, “I get a call once a month or so from someone saying, ‘I’m interested in this problem. Could you go into more detail?’ I wouldn’t get those calls before we unveiled this program.”


Metadata Advice from Wolff

We asked Wolff to share some best practices on how to produce and maintain metadata for your open data program. 

  1. Consider not only how metadata might be used by the public, but how it might help you achieve your program’s goals. The addition of a problem statement field to our metadata forms helped us engage the public. Likewise, we use the “estimated update frequency” and “last updated” metadata fields to help us identify when datasets need updating.
  2. Give more rather than less information. Don’t be shy about adding new metadata fields to Socrata’s template.
  3. Use language people might use in a search. Follow standard language for metadata fields. Make your data more research-friendly by offering structural metadata.
  4. Have data owners either produce or review as much of the metadata as possible. Make it easy for data owners to contribute metadata. In Cambridge, we supply data owners with metadata worksheets and then transcribe their responses to Socrata.
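
The first tip mentions using the “estimated update frequency” and “last updated” fields to identify datasets that need updating. One way such a check might look is sketched below in Python. The frequency labels, the age thresholds, and all names are assumptions for illustration; this is not Cambridge's actual process.

```python
from datetime import date, timedelta
from typing import List, NamedTuple

# Assumed mapping from an update-frequency label to the longest
# acceptable time since the last update
MAX_AGE = {
    "daily": timedelta(days=1),
    "weekly": timedelta(days=7),
    "monthly": timedelta(days=31),
    "annually": timedelta(days=366),
}


class DatasetRecord(NamedTuple):
    title: str
    update_frequency: str  # the "estimated update frequency" field
    last_updated: date     # the "last updated" field


def overdue_datasets(records: List[DatasetRecord], today: date) -> List[str]:
    """Return titles of datasets older than their stated frequency allows."""
    overdue = []
    for record in records:
        max_age = MAX_AGE.get(record.update_frequency)
        # Unknown labels (for example "as needed") are skipped, not flagged
        if max_age is not None and today - record.last_updated > max_age:
            overdue.append(record.title)
    return overdue


# Made-up records, for illustration only
records = [
    DatasetRecord("Example Dataset A", "weekly", date(2017, 10, 1)),
    DatasetRecord("Example Dataset B", "monthly", date(2017, 10, 10)),
    DatasetRecord("Example Dataset C", "as needed", date(2016, 1, 1)),
]
overdue = overdue_datasets(records, today=date(2017, 10, 24))
print(overdue)
```

Passing `today` in as a parameter, rather than reading the system clock inside the function, keeps the check easy to test and rerun for any date.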


Learn more about Cambridge’s creation of a Civic Innovation Challenge Inventory.

Want to improve your use of metadata and how citizens engage with the information you publish? Contact Socrata.
