scrAPIs

John Musser, March 21st, 2006

What’s a scrAPI? A scrAPI, which at this point is more of an idea than a thing, was recently described by Thor Muller in his blog as a type of community-built API that provides a programming layer above web sites that don’t otherwise have an API. This intermediate layer, which exists independently of the site being scraped, does the dirty work of screen-scraping raw HTML from the source and returns just the relevant data in some cleaner XML format. The result is a collaboratively built and maintained set of code for data access from any source.
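To make the idea concrete, here is a minimal sketch of what one such scraping layer might look like, using only Python's standard library. Everything specific in it is an assumption made for illustration: the sample markup, the "listing" class name, the `ListingScraper` and `scrape_to_xml` names, and the shape of the XML output. It is not taken from Thor's post or from any existing scrAPI project, and a real one would fetch the HTML from the source site instead of using an inline string.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical page markup standing in for a fetched web page.
SAMPLE_HTML = """
<html><body>
  <div id="nav">Home | About</div>
  <ul>
    <li class="listing"><a href="/item/1">Red bicycle</a></li>
    <li class="listing"><a href="/item/2">Oak desk</a></li>
  </ul>
</body></html>
"""


class ListingScraper(HTMLParser):
    """Collects (href, title) pairs from links inside <li class="listing">."""

    def __init__(self):
        super().__init__()
        self.listings = []
        self._in_listing = False
        self._href = None   # href of the link currently being read, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "listing":
            self._in_listing = True
        elif tag == "a" and self._in_listing:
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a listing link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.listings.append((self._href, "".join(self._text).strip()))
            self._href = None
        elif tag == "li":
            self._in_listing = False


def scrape_to_xml(html):
    """Scrape the raw HTML and return only the relevant data as an XML string."""
    scraper = ListingScraper()
    scraper.feed(html)
    scraper.close()
    root = ET.Element("listings")
    for href, title in scraper.listings:
        item = ET.SubElement(root, "listing", href=href)
        item.text = title
    return ET.tostring(root, encoding="unicode")


print(scrape_to_xml(SAMPLE_HTML))
```

The navigation text and the rest of the page chrome are dropped, and callers see only the listings. The fragile part is the assumption about the page's markup, which is exactly the piece a community would have to keep maintaining as the source site changes.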

It’s an interesting idea, though one with many complications, of course. Not the least of these is that many companies object to scraping, whether for reasons of load, stability, or copyright. A good example is Craigslist vs. Oodle.

In this follow-up post Thor notes that the term was originally coined by Paul Bausch back in 2002, in reference to scraping Amazon data. Interestingly, it was just this sort of scraping that was a key driver in leading Amazon to subsequently build a real API: people are going to do it anyway, so let’s formalize and leverage it.

Tags: Issues, Law

8 Responses to “scrAPIs”

March 22nd, 2006
at 3:44 am
Comment by: Thor Muller

John,
great point about the potential complications regarding scraping. I have some new essays in the works that address the business and the legal issues around these. There are a tremendous number of data sources that are effectively in the public domain, and a solid framework of best practices would help minimize the potential downsides around load and stability. With community involvement, scrAPIs could be instrumental in supporting equitable terms of use.

While there are plenty of data providers that will want to guard their siloed data, there are many more who don’t care or want to make it available in broader form. If we treat the data and its providers with respect, then we can help free it while preserving goodwill.

I would suggest that we respect the wishes of data providers that don’t want to open up their systems. No reason to fight those battles yet.

There is an exception–when the data is public domain but relentlessly siloed by government agencies. In these cases I think we have the right–perhaps duty–to free it for the benefit of all.

We just may see instances of civil disobedience via scrAPIs before long.

March 22nd, 2006
at 11:03 am
Comment by: John Musser

Civil disobedience via APIs — now there’s a concept!

March 22nd, 2006
at 7:55 pm
Comment by: elephantwing.com » Blog Archive » misc links (03-22-06)

[...] scrAPI’s – replacing individually-maintained screen scrapers with community-maintained screen scrapers and building API’s around them. Another interesting concept to keep an eye on, including the possibility of increased legal issues [...]

March 24th, 2006
at 3:22 pm
Comment by: The hottest girl with the dumbest name - Brokekid.net

[...] scrAPIs is not an std. [...]

March 25th, 2006
at 12:38 pm
Comment by: Richard K Miller dot coooooooooom » scrAPIs

[...] Sources: ThorMuller.com ProgrammableWeb.com [...]

March 26th, 2006
at 5:05 am
Comment by: THINK / Musings » Blog Archive » Scraping data and API’s

[...] I was wondering how easy it would be to build a generic approach to opening up API’s on web sites that didn’t formally publish them, and then last night I saw this post about scrAPI’s. Great stuff. I would like to be able to cut and paste data sources and mix them together myself. I find myself doing this manually too often today (e.g. the other night I was cutting and pasting Rotten Tomatoes reviews vs. a movie database). So many mashups today are based on geo location data. It’s like my one year old who has six or seven words: most everything is at some point “hot”. Latitude and longitude are just the easiest and first data source to be mined, and things are going to get a lot more interesting as the data sources become increasingly diverse. I look forward to Muller’s coming posts on the business and legal issues regarding scrAPIng. Posted by John Filed in think, building blocks, API’s [...]

August 18th, 2006
at 1:20 am
Comment by: ProgrammableWeb.com » Blog Archive » Dapps, Ruby and Microformats

[...] The very interesting Dapper service officially launched yesterday. It is designed to allow anyone, including non-coders, to create an API for any web site (akin to earlier discussions here about scrAPIs). You can use their GUI or use an SDK. Sample services include Magg, a movie aggregator, and Blotter, which graphs blogs over time. This service is now listed here at ProgrammableWeb. [...]

January 18th, 2012
at 11:49 pm
Comment by: Medical Coding Salary

Hey there! I’ve been following your blog for a long time now and finally got the bravery to go ahead and give you a shout out from Lubbock Tx! Just wanted to say keep up the excellent work!
