Open Library API: Cataloging 13 Million Books

Raymond Yee, May 16th, 2008

The Open Library is a project of the non-profit Internet Archive, whose long-term goal is to present “one web page for every book ever published.” A recent release of the Open Library brought the total number of book records to over 13.4 million, including over 234,000 records with the full text of the book. A new public Open Library API was also announced to give read access to the Open Library. (With the addition of the Open Library API profile here, there are now 14 different book-oriented APIs on ProgrammableWeb.)

Consider one of my favorite books on the Python programming language: David Beazley’s Python Essential Reference, 3rd Edition. You can find the Open Library record for the book at

http://www.openlibrary.org/b/OL7668717M

On that page, you will see metadata about the book (e.g., title, author, language, and ISBN), as well as links to booksellers and libraries that might be able to sell or lend you a copy of the book. Something that might be surprising is that you (and anyone else) can edit the record and see the history of revisions to the record. Think of Open Library as a big book-oriented wiki.

Using this book as an example, let’s look at how to apply the three parts of the Open Library API:

  • get (to get an object)
  • things (to query for objects)
  • versions (to look for versions of objects)

    Note that a good place to try out the API is the Open Library API Sandbox, which lets you issue a query to the API and see the response.

    First, you can issue the following get query:

    curl "http://www.openlibrary.org/api/get?key=/b/OL7668717M"

    to get a JSON object that holds metadata about the book. In fact, you can get the JSON response to show up nicely in the browser by appending the following parameters to the URL:

    &prettyprint=true&text=true

    These pretty-print the response and send it as plain text.
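As a minimal sketch of the same get request built in code (Python here; the helper name `build_get_url` is made up for illustration, while the endpoint and parameters are the ones shown above), the following only constructs the URL and does not contact the server:

```python
from urllib.parse import urlencode

# Endpoint taken from the curl example in the article.
GET_ENDPOINT = "http://www.openlibrary.org/api/get"


def build_get_url(key, pretty=False):
    """Return the get URL for an Open Library key such as /b/OL7668717M.

    Hypothetical helper, not part of any Open Library client library.
    """
    params = [("key", key)]
    if pretty:
        # The two display parameters described in the article.
        params += [("prettyprint", "true"), ("text", "true")]
    # safe="/" leaves the slashes in the key unescaped, as in the curl example.
    return GET_ENDPOINT + "?" + urlencode(params, safe="/")


print(build_get_url("/b/OL7668717M", pretty=True))
```

The resulting string can be pasted into a browser or handed to any HTTP client, for example `urllib.request.urlopen`.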


    Second, let’s figure out how to get the Open Library identifier for this book (which you needed for the get query) using a things query. If you have an ISBN-10 for the book (here, 0672328623), you can issue a query whose value is

    {"type":"\/type\/edition", "isbn_10":"0672328623"}

    to generate the corresponding response.
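Because the query value is itself JSON, it has to be URL-encoded before it goes into a request. The sketch below shows one way to do that in Python. It rests on two assumptions the article does not spell out: that things queries go to `/api/things` alongside the get endpoint, and that the JSON travels in a parameter named `query`. The helper name is hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumption: endpoint path and the "query" parameter name are inferred,
# not confirmed by the article.
THINGS_ENDPOINT = "http://www.openlibrary.org/api/things"


def build_things_url(query):
    """Serialize a query dict to compact JSON and URL-encode it."""
    encoded = json.dumps(query, separators=(",", ":"))
    return THINGS_ENDPOINT + "?" + urlencode({"query": encoded})


isbn_query = {"type": "/type/edition", "isbn_10": "0672328623"}
things_url = build_things_url(isbn_query)
print(things_url)

# In JSON, "\/" is simply an escaped "/", so the article's spelling and the
# plain one decode to the same string.
assert json.loads('"\\/type\\/edition"') == "/type/edition"
```

Building the query as a dict and serializing it avoids hand-escaping the quotes and slashes.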

    You can also use a things query to search for books by title. Here you can put a wildcard (*) at the end of the field’s value, and you mark the field’s name with a trailing ~:

    {"type":"\/type\/edition",
     "title~":"Python Essential Reference*"}

    to generate a response containing the corresponding Open Library identifier.
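The two markers are easy to mix up: the ~ goes on the field name and the * goes on the value. A small Python sketch makes the pairing explicit. The function name `prefix_match` is invented for illustration and is not part of the API.

```python
import json


def prefix_match(type_key, field, prefix):
    """Build a things query matching values of `field` that start with `prefix`.

    Hypothetical helper following the convention described in the article:
    "~" is appended to the field name, "*" to the value.
    """
    return {"type": type_key, field + "~": prefix + "*"}


title_query = prefix_match("/type/edition", "title", "Python Essential Reference")
print(json.dumps(title_query))
```

The printed JSON is equivalent to the query shown above, with unescaped slashes.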

    Third, we can get at the different versions of a record in Open Library by performing a versions query whose value is

    {"key": "\/b\/OL7668717M",
     "sort":"-created", "limit":10}

    to generate a JSON object holding version data for the record.
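A versions query can be assembled the same way. The Python sketch below is hedged on several points: the `/api/versions` path and the `query` parameter name are assumptions modeled on the get endpoint, and reading the leading "-" in "-created" as "newest first" is an interpretation rather than something the article states. Both function names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumption: endpoint path and "query" parameter name mirror the get endpoint.
VERSIONS_ENDPOINT = "http://www.openlibrary.org/api/versions"


def versions_query(key, limit=10, newest_first=True):
    """Build the query dict for the revision history of one record."""
    return {
        "key": key,
        # Assumed meaning: a leading "-" sorts in descending order.
        "sort": "-created" if newest_first else "created",
        "limit": limit,
    }


def build_versions_url(query):
    """URL-encode the JSON query; nothing is sent over the network."""
    encoded = json.dumps(query, separators=(",", ":"))
    return VERSIONS_ENDPOINT + "?" + urlencode({"query": encoded})


history = versions_query("/b/OL7668717M")
print(build_versions_url(history))
```

Keeping the query dict separate from the URL builder makes it easy to adjust `limit` or the sort order before encoding.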

    There’s plenty more to explore in the documentation of the API, including the list of types supported in the API.

    It’ll be interesting to see whether the API gets a following primarily in the library community or in the larger world. The announcement of the API on the code4lib list (a “forum for discussion of computer programming in the area of libraries and information science”) immediately prompted the question of why the API was not implemented using SRU (a protocol used, for example, at the Library of Congress). The range of responses in the thread (from SRU is “incomprehensible to non-librarians” to “non librarian students look at the [SRU] document and start working with it straight away.”) should be familiar to anyone who has struggled with the question of whether to adopt an existing standard or protocol or to create one’s own. How can concepts that are crystal clear to one group appear so obscure to another? Who exactly is the audience for a given API? How do you accommodate multiple audiences in the design of an API?


    4 Responses to “Open Library API: Cataloging 13 Million Books”

    May 19th, 2008
    at 11:49 am
    Comment by: Vineet Gupta

    This is a perfect match for a design we have to mash a website that presents books, reviews and news about them together – http://cookbook.daylife.com/booksandreviews

    May 19th, 2008
    at 11:09 pm
    Comment by: John Musser

    @Vineet: Looks like a great idea.

    May 21st, 2008
    at 11:45 am
    Comment by: Mashup Guide :: Notelets for 2008.05.20

    [...] OUseful Info: An OpenLibrary API Handshake With Yahoo Pipes cf my PW post (Open Library API: Cataloging 13 Million Books) [...]

    May 23rd, 2008
    at 6:53 am
    Comment by: Michael Nielsen » Biweekly links for 05/23/2008

    [...] Open Library API: Cataloging 13 Million Books [...]
