For more than 150 years, The New York Times has meticulously indexed its archives, giving it one of the most authoritative news vocabularies ever developed. Now this archive has been linked to the open knowledge bases DBpedia and Freebase.
Both DBpedia and Freebase draw on the internet community, along with existing information sources like Wikipedia, to build structured data on a huge range of subjects. As the DBpedia Wiki explains:
The DBpedia knowledge base currently describes more than 2.9 million things, including at least 282,000 persons, 339,000 places (including 241,000 populated places), 88,000 music albums, 44,000 films, 15,000 video games, 119,000 organizations (including 20,000 companies and 29,000 educational institutions), 130,000 species and 4400 diseases.
This structured data allows you to answer questions like “give me all cities in New Jersey with more than 10,000 inhabitants” or “find people born between January 1st 1980 and September 18th 2000”. By mapping the names of individuals indexed in The New York Times archive to the information already present in these open knowledge bases, it becomes possible to place these individuals in a much broader context. The New York Times blog explains how this data can be used:
So now you can visit http://data.nytimes.com/N66220017142656459133 and see that our “Colbert, Stephen” is equivalent to DBPedia’s http://dbpedia.org/resource/Stephen_Colbert and Freebase’s http://rdf.freebase.com/rdf/en.stephen_colbert. Even more importantly, your computer can visit http://data.nytimes.com/N66220017142656459133.rdf and get all of this information in a computer-readable Resource Description Framework (RDF) document. Just for fun, we also threw in some other tidbits, such as the first and last date that “Colbert, Stephen” was mentioned (2002-12-10 and 2009-08-26) and the number of articles about this subject (46). To make it easy to retrieve all the latest headlines tagged “Colbert, Stephen,” we’ve included the NYT Article Search API query for you to use (first you’ll need to pick up an API key from our Developer Network).
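To give a rough idea of what a program might do with such a document, here is a minimal Python sketch that pulls the equivalence links out of an RDF/XML snippet. The three URIs come from the passage above, but the snippet itself is a hand-written stand-in: the assumption that the links are expressed as owl:sameAs statements inside an rdf:Description element, and the helper name equivalent_uris, are illustrative and not taken from the real file at data.nytimes.com.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"

NYT_URI = "http://data.nytimes.com/N66220017142656459133"

# Hand-written sample, NOT the real NYT document. It assumes the
# equivalences are published as owl:sameAs statements.
SAMPLE_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <rdf:Description rdf:about="http://data.nytimes.com/N66220017142656459133">
    <owl:sameAs rdf:resource="http://dbpedia.org/resource/Stephen_Colbert"/>
    <owl:sameAs rdf:resource="http://rdf.freebase.com/rdf/en.stephen_colbert"/>
  </rdf:Description>
</rdf:RDF>
"""


def equivalent_uris(rdf_xml, subject_uri):
    """Return the owl:sameAs targets listed for subject_uri (hypothetical helper)."""
    root = ET.fromstring(rdf_xml)
    links = []
    # Look only at descriptions of the requested subject.
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        if desc.get(f"{{{RDF_NS}}}about") != subject_uri:
            continue
        # Collect the rdf:resource attribute of each owl:sameAs child.
        for same in desc.findall(f"{{{OWL_NS}}}sameAs"):
            target = same.get(f"{{{RDF_NS}}}resource")
            if target:
                links.append(target)
    return links


if __name__ == "__main__":
    for uri in equivalent_uris(SAMPLE_RDF, NYT_URI):
        print(uri)
```

A real client would fetch the .rdf URL over HTTP and would probably hand it to a full RDF library rather than a plain XML parser, since RDF/XML allows the same statements to be written in several different shapes.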
Over the next few months, The New York Times plans to link each of the nearly 30,000 subject headings in its archive, which will include locations, organizations and descriptors in addition to person names.
For more on the Times’ open platform efforts, you can use our directory to find information on 10 different New York Times APIs.