This repository has been archived by the owner on Oct 17, 2020. It is now read-only.

The Data Federation Playbook

Julia Lindpaintner edited this page Mar 8, 2019 · 2 revisions

The following was originally published as part of the Data Federation Framework from Phase 1 of this project.

The following nine plays are drawn from our interviews with successful federated data efforts. They are intended to provide highlights of what project teams recall as being essential to the success of their project. It's unlikely that any effort will be able to execute on all plays, but if you are undertaking a new federated data effort, it will greatly improve your chance of success if you have a few of these plays in motion.

Policy Should Be Focused on Processes and Outcomes, Not Implementation

Project teams must be able to adapt the details of implementing an open data standard on the fly to meet the needs of the owner and user communities. Don't specify technical details in policy or law; instead, focus on goals and on the processes used to achieve them (e.g., "developed iteratively with user feedback" or "must develop a machine-readable data standard").

Identify Use Cases With Demonstrated Demand

When deciding whether to invest in a federated data effort, seek out ways to prove demand. Are citizens or media frequently requesting a certain type of data? Is there already a community built around fixing or cleaning up a certain type of data? Talk with your call centers and local communities. For example, with the Voting Information Project, Google reached out to the public sector because it had data indicating that people were searching for basic polling place and ballot data but not finding it.

Develop a Killer App

If your hypothesis is that your data is useful, do the work to build out the first use case and demonstrate that value. This will help win over the hearts and minds of participants. If you can't think of a single use case for your data, it might make sense to do more outreach with relevant communities to refine the value proposition before undertaking a larger effort. For example, the General Transit Feed Specification has been widely and enthusiastically adopted because it allows an agency's transit data to be displayed in Google Maps.

Allocate Proper Resources

Throwing money at data owners without any structure is unlikely to produce good results, but financial support for training, new software, and high-touch onboarding is very helpful. For example, data.gov.ie has an "open data unit" of 3-4 people whose sole purpose is to provide high-touch assistance to departments that want to provide data but need help navigating technical or procedural issues. And when the state of Connecticut wanted to normalize the financial data of its municipalities, it provided grants for obtaining new software and assisted with training, which built up goodwill and was of significant practical importance.

Implementation Should Be Driven by a Single Empowered Team

It's important to be able to deliver iterations on the application and specification quickly in order to build trust and momentum. Do not attempt to split responsibilities (e.g., the specification maintained by one department, and the application by another).

When Possible, Deliver Value Directly to Data Owners

The teams responsible for complying with the policy bear the vast majority of the burden, yet often receive little benefit. The long-term success of these efforts is often contingent on delivering value directly to these data owners. For example, in the state of Connecticut, municipalities readily submit their data to the open data portal because it is the easiest way to publish and share data from one department to another, which is an operational necessity. And for the Philadelphia Open Data Portal, newly published data gets publicity in the form of a blog post and gets visualizations layered on top of it. Government employees are very motivated by public service, and will be delighted to comply if they see the public engaging fruitfully with the data.

Start With Simple Technologies

Just because a project might have tremendous value, or be complex organizationally, doesn't mean it needs to be complex at the technical level. For the DATA Act, for example, the project team found that CSVs were the easiest format for data owners to comply with, and also very easy to ingest and validate. The system behind the DATA Act, which was built on time for 1/10th of the CBO cost estimate, is architecturally just a simple web application.
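
To illustrate why CSVs are easy to ingest and validate, here is a minimal sketch in Python using only the standard library. The column names and rules (agency_id, fiscal_year, amount) are hypothetical and chosen for illustration; they are not taken from the DATA Act specification.

```python
import csv
import io


def _is_number(value):
    """Return True if the string can be parsed as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


# Hypothetical specification: required column name -> validation rule.
REQUIRED_COLUMNS = {
    "agency_id": lambda v: v.strip() != "",
    "fiscal_year": lambda v: v.isdigit() and len(v) == 4,
    "amount": _is_number,
}


def validate_csv(text):
    """Return a list of readable error messages; an empty list means valid."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        # Without the required columns, row-level checks are meaningless.
        return [f"missing column: {c}" for c in missing]

    errors = []
    # Data rows start at 2 because the header occupies row 1.
    for row_number, row in enumerate(reader, start=2):
        for column, is_valid in REQUIRED_COLUMNS.items():
            value = row[column] or ""  # short rows yield None
            if not is_valid(value):
                errors.append(f"row {row_number}: invalid {column}: {value!r}")
    return errors
```

A submission tool built along these lines could show the returned messages directly to the data owner, which keeps the feedback loop short without requiring any infrastructure beyond a simple web application.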

Nurture Early Adopters

Identify 2-3 early adopters, and work side by side with them to comply with a draft of the specification. Their success is critical to adoption: they will become much more convincing evangelists for your effort than you can be. They're also critical from a practical standpoint, because they allow for proper feedback and generalization, provide a forum to discuss challenges, and demonstrate technical feasibility.

Support Compliance Tooling

To lessen the burden on data owners, it's important to do everything you can to help them compile and validate their data. For example, data.gov provides inventory.data.gov, a metadata inventory tool for agencies that easily exports the metadata to the required format. It also provides an online tool for validating that a data.json file adheres to the specified format. Publicly accessible, human-readable documentation is also a critical part of the success of these efforts.
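
As a rough illustration of what such a validation tool does, the sketch below checks a data.json-style catalog for required metadata fields. It is not the data.gov validator: the list of required fields is an assumed subset chosen for illustration, and the function name validate_catalog is hypothetical. A real tool would check the full published schema.

```python
import json

# Assumed, illustrative subset of required dataset fields (not the full schema).
REQUIRED_FIELDS = ("title", "description", "identifier", "modified")


def validate_catalog(text):
    """Return a list of readable error messages; an empty list means valid."""
    try:
        catalog = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]

    # The catalog is expected to be an object holding a list of datasets.
    datasets = catalog.get("dataset") if isinstance(catalog, dict) else None
    if not isinstance(datasets, list):
        return ["catalog must be an object with a 'dataset' list"]

    errors = []
    for index, entry in enumerate(datasets):
        if not isinstance(entry, dict):
            errors.append(f"dataset[{index}]: expected an object")
            continue
        for field in REQUIRED_FIELDS:
            if not entry.get(field):
                errors.append(f"dataset[{index}]: missing or empty '{field}'")
    return errors
```

Reporting every problem in one pass, rather than stopping at the first error, lets a data owner fix a file in a single round trip.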