As many jump into making recommendations on how the U.S. government’s Recovery data should be packaged and disseminated, it’s worth looking at some important earlier work in this area, work that may be unfamiliar to those who are new to open government.
The first is the ACM U.S. Public Policy Committee (USACM) Recommendations on Open Government. The ACM is known as “the world’s largest educational and scientific computing society,” and USACM “serves as the focal point for ACM’s interaction with U.S. government organizations, the computing community, and the U.S. public in all matters of U.S. public policy related to information technology.” The policy statement on “open government” first sets the context for its recommendations:
Individual citizens, companies and organizations have begun to use computers to analyze government data, often creating and sharing tools that allow others to perform their own analyses. This process can be enhanced by government policies that promote data reusability, which often can be achieved through modest technical measures. But today, various parts of governments at all levels have differing and sometimes detrimental policies toward promoting a vibrant landscape of third-party web sites and tools that can enhance the usefulness of government data.
The recommendations “for data that is already considered public information” are:
- Data published by the government should be in formats and approaches that promote analysis and reuse of that data.
- Data republished by the government that has been received or stored in a machine-readable format (such as online regulatory filings) should preserve the machine-readability of that data.
- Information should be posted so as to also be accessible to citizens with limitations and disabilities.
- Citizens should be able to download complete datasets of regulatory, legislative or other information, or appropriately chosen subsets of that information, when it is published by government.
- Citizens should be able to directly access government-published datasets using standard methods such as queries via an API (Application Programming Interface).
- Government bodies publishing data online should always seek to publish using data formats that do not include executable content.
- Published content should be digitally signed or include attestation of publication/creation date, authenticity, and integrity.
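Two of these recommendations (machine-readable formats without executable content, and attestation of publication date and integrity) can be illustrated with a minimal sketch. The records, field names, and date below are hypothetical, and a SHA-256 digest only demonstrates an integrity check; a real publisher would presumably add a proper digital signature to establish authenticity.

```python
import csv
import hashlib
import io
import json


def publish_dataset(records, fieldnames, published_on):
    """Serialize records as plain CSV and build a small attestation manifest."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    payload = buffer.getvalue().encode("utf-8")
    manifest = {
        "published_on": published_on,  # attested publication date
        "sha256": hashlib.sha256(payload).hexdigest(),  # integrity digest
        "bytes": len(payload),
    }
    return payload, manifest


def verify_dataset(payload, manifest):
    """Return True if the downloaded bytes match the published digest."""
    return hashlib.sha256(payload).hexdigest() == manifest["sha256"]


# Hypothetical records, for illustration only.
records = [
    {"award_id": "A-001", "recipient": "Example City", "amount_usd": 125000},
    {"award_id": "A-002", "recipient": "Sample County", "amount_usd": 80000},
]
payload, manifest = publish_dataset(
    records, ["award_id", "recipient", "amount_usd"], "2009-01-01"
)
print(json.dumps(manifest, indent=2))
print(verify_dataset(payload, manifest))
```

Anyone who downloads the CSV can recompute the digest and compare it with the manifest, so a modified copy is detectable without trusting the mirror that served it.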
The second is a set of Open Government Data Principles formulated in October 2007 by the Open Government Working Group, a group of “30 open government advocates gathered to develop a set of principles of open government data”:
Government data shall be considered open if they are made public in a way that complies with the principles below:
- 1. Complete: All public data are made available. Public data are data that are not subject to valid privacy, security or privilege limitations.
- 2. Primary: Data are collected at the source, with the finest possible level of granularity, not in aggregate or modified forms.
- 3. Timely: Data are made available as quickly as necessary to preserve the value of the data.
- 4. Accessible: Data are available to the widest range of users for the widest range of purposes.
- 5. Machine processable: Data are reasonably structured to allow automated processing.
- 6. Non-discriminatory: Data are available to anyone, with no requirement of registration.
- 7. Non-proprietary: Data are available in a format over which no entity has exclusive control.
- 8. License-free: Data are not subject to any copyright, patent, trademark or trade secret regulation. Reasonable privacy, security and privilege restrictions may be allowed.
Compliance must be reviewable.
The third is the paper “Government Data and the Invisible Hand” (Yale Journal of Law & Technology 11: 160) by David Robinson, Harlan Yu, William Zeller, and Edward Felten. The abstract contains the following recommendation:
Today, government bodies consider their own websites to be a higher priority than technical infrastructures that open up their data for others to use. … It would be preferable for government to understand providing reusable data, rather than providing websites, as the core of its online publishing responsibility.
On ProgrammableWeb last year we distilled the paper’s argument as follows:
The conclusion is based on a claim that the executive branch is comparatively ineffective at creating tools for presenting data and should therefore leave that work to a private sector (either nonprofit or commercial entities) that is best able to respond to a wide variety of possible uses for government data. That doesn’t mean that the government should provide no user interface to the data, but rather “should focus on creating a simple, reliable and publicly accessible infrastructure that exposes the underlying data.” Fancier interfaces and tools should be built by others.
Moreover, the authors have recommended a specific mechanism for ensuring that the government does not privilege any user interface over their public data infrastructure: “require that federal websites themselves use the same open systems for accessing the underlying data as they make available to the public at large.”
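The mechanism the authors describe, with the government’s own site consuming the same public interface as everyone else, might look roughly like the sketch below. All function names and data here are hypothetical, and an in-memory list stands in for whatever data store and HTTP layer a real system would use.

```python
import json

# Hypothetical in-memory data store standing in for the real infrastructure.
_AWARDS = [
    {"award_id": "A-001", "recipient": "Example City", "amount_usd": 125000},
    {"award_id": "A-002", "recipient": "Sample County", "amount_usd": 80000},
]


def public_api_list_awards(min_amount_usd=0):
    """The single public access path: returns JSON text, as any client would receive."""
    rows = [a for a in _AWARDS if a["amount_usd"] >= min_amount_usd]
    return json.dumps(rows)


def render_government_page(min_amount_usd=0):
    """The government's own site goes through the public API, with no private shortcut."""
    rows = json.loads(public_api_list_awards(min_amount_usd))
    return "\n".join(f"{r['recipient']}: ${r['amount_usd']:,}" for r in rows)


def third_party_total(min_amount_usd=0):
    """A third-party tool builds its own analysis on exactly the same API."""
    rows = json.loads(public_api_list_awards(min_amount_usd))
    return sum(r["amount_usd"] for r in rows)


print(render_government_page(100000))
print(third_party_total())
```

Because `render_government_page` never touches `_AWARDS` directly, any gap or defect in the public interface would show up on the government’s own pages too, which is the incentive the authors are after.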
The National Dialogue closed on May 3. Submissions included my proposal that the USACM Recommendations on Open Government be adopted. Open data also drew attention in the discussion around Tim Berners-Lee’s proposal for linked open data, which was rated 4 out of 5 stars with 62 votes and 42 comments, making it one of the most visible proposals.
BTW, those seeking a thoughtful and technically solid analysis of this dialogue should read Greg Elin’s Reviewing the National Dialogue on Recovery IT.