Wednesday, October 3, 2007

The Open Library


I stumbled on an amazing new project led by Aaron Swartz called the Open Library, not to be confused with this Open Library, though there appears to be some resemblance. What strikes me about Aaron's project is that it is so relevant to The Encyclopedia of Life that it scares me I hadn't heard of it until now. According to their "About the technology" page:

Building Open Library, we faced a difficult new technical problem. We wanted a database that could hold tens of millions of records, that would allow random users to modify its entries and keep a full history of their changes, and that would hold arbitrary semi-structured data as users added it. Each of these problems had been solved on its own, but nobody had yet built a technology that solved all three together.

The consequence of all this is that there is a front-facing page for every book, with the option to edit its metadata. All versions and users are tracked. The content of the "About Us" page sounds eerily like E. O. Wilson's proclamations in his 2003 opinion piece in TREE (doi:10.1016/S0169-5347(02)00040-X). For those of you who don't recognize the name Aaron Swartz, he's the whiz behind a lot of important functionality on the web we see today. It's also worth reading his multi-part thoughts on the spirit of Wikipedia and why it may soon hit a wall.

3 comments:

Taneya said...

Hi David - the two Open Libraries are in fact the same. Great post.

Rod Page said...

Well, Open Library aren't the only ones doing this. Freebase is a similar project, albeit on a bigger scale and with more bells and whistles, including user-editable types. There's not a huge amount of new content to see (i.e., content not derived from Wikipedia), but the Polar bear page is a simple example.

Rod Page said...

I'd been playing with a key-value database as a way to store data about specimens, sequences, publications, trees, etc. Turns out that this approach has a big literature, especially in medical bioinformatics. There's a nice article on Wikipedia under the title Entity-Attribute-Value model, and I've bookmarked a few papers on Connotea (tagged entity–attribute–value).
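
For readers unfamiliar with the entity-attribute-value idea, here is a minimal sketch in Python using the standard-library sqlite3 module. It is only an illustration, not the schema used in the experiments mentioned above: the table name, the helper functions (make_store, add_fact, get_entity) and the sample identifiers and values are all hypothetical. Each fact is one row of (entity, attribute, value), so specimens, publications and other record types can share a single table without a fixed set of columns.

```python
import sqlite3


def make_store():
    """Create an in-memory database with one generic EAV table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE eav ("
        "entity TEXT NOT NULL, "
        "attribute TEXT NOT NULL, "
        "value TEXT NOT NULL)"
    )
    return conn


def add_fact(conn, entity, attribute, value):
    """Store one (entity, attribute, value) triple; values are kept as text."""
    conn.execute(
        "INSERT INTO eav (entity, attribute, value) VALUES (?, ?, ?)",
        (entity, attribute, str(value)),
    )


def get_entity(conn, entity):
    """Collect all attributes of an entity into a dict of value lists."""
    rows = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ? ORDER BY rowid",
        (entity,),
    )
    record = {}
    for attribute, value in rows:
        record.setdefault(attribute, []).append(value)
    return record


# Hypothetical sample data: two different kinds of record in the same table.
conn = make_store()
add_fact(conn, "specimen:1", "taxon", "Ursus maritimus")
add_fact(conn, "specimen:1", "collector", "A. Example")
add_fact(conn, "publication:1", "title", "An example title")
add_fact(conn, "publication:1", "cites", "specimen:1")

print(get_entity(conn, "specimen:1"))
```

The trade-off is that adding a new attribute needs no schema change, but queries that span several attributes become self-joins on the same table, and type checking moves out of the database and into the application.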