Thursday, May 10, 2007

Encyclopedia of Life

The Encyclopedia of Life was officially launched yesterday, and I have been reading various public postings to gauge the response.

Of all those I have scanned, Slashdot is perhaps the most active and most informative. So, here is my best attempt at summarizing the public reception:

1. There's a notable lack of understanding of how this will differ from WikiSpecies. There is little to no appreciation for the challenge of names management of the kind spearheaded by uBio. This is a critical piece of the puzzle that cannot be done in a 2-dimensional wiki environment.

2. Most people understand that content will be developed by first "harvesting" material scattered across the Internet, which will then be cleaned up by "scientists". But there has been little to no discussion of how that will be accomplished or how accreditation will be maintained.

3. There hasn't yet been much discussion on the TAXACOM or ENTOMO-L listservs about the encyclopedia. A few suggested that monies would be better put into pure systematics rather than into a "bean-counting" exercise. Others recognize that the content will necessarily be created by systematists, but see that there is as yet no incentive to do so. Millions of $$ are dumped into scanning materials, wages, etc. but how does that filter down to individual contributors?

4. The distinction between "MyEoL" content and the canonical "EoL" content is not well appreciated.

5. The timeframe for "completion" has been grossly misunderstood. Many believe they will have to wait 10 years to see anything come out of EoL.

The EoL Workbench: "The Living Encyclopedia"

While I do think an Encyclopedia of Life will be a most amazing resource, there is a very large & critically missing piece of the puzzle as it relates to #3 in my list above. The EoL promotional video grandly expresses that such an encyclopedia will transform the science of biology. How? What we see to date is a digital version of a paper encyclopedia with a bit of gadgetry to encourage public participation. If this is to transform biology, it must simplify communications or the work flow in day-to-day biological pursuits. This is where my "Living Encyclopedia" idea comes in.

First, the communications required to conduct revisionary work need to be very well understood. For starters, ALL type specimens must be made available & tied to direct channels of communication with the curators charged with maintaining those holdings. The folks at GBIF have taken great first steps, but digital representations of ALL type specimens, accessible via DiGIR, TAPIR or other means, are nowhere near complete. Systematists must still scour old literature to learn where type specimens are held, write letters, etc. to acquire specimens BEFORE any serious revision can be started.

Second, there is currently no effective link between publishers of taxonomic literature & the nomenclators. Until that happens, an encyclopedia will be woefully dated.

So here's the idea...

Integrated into "MyEoL" ought to be a blank slate - an organizational workbench if you will. Here, systematists run very simple web service queries to GBIF to create a visual "cloud" of specimens of interest. This would be much like Yahoo's Pipes. Through drag-and-drop, communication with curators is coordinated for loans. A systematist would continue to use this tool to add/remove specimens according to the concept they are working with, connect pieces of the cloud to other resources like those in GenBank, insert references, etc., all via drag-and-drop. Upon completion of the work, the circumscribed specimens and the entire visual representation of the workflow are "locked" and a permanent URL is issued, which can and ought to be present in the eventual publication. When the publication is accepted, the systematist then returns to the workbench to insert the publication's DOI. In such a manner, all the digital bits are in place. This permanent URL that describes the circumscription of specimens is then in the public domain such that other systematists may examine the "guts" behind the publication. Once all is said and done, this then becomes the scaffolding for a species page in EoL.
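To make the workflow a little more concrete, here is a rough sketch of how the lock-then-publish steps might be modelled. Everything in it is my own assumption for illustration: the Workbench class, its method names, the example.org base URL, and the specimen, GenBank and DOI strings are hypothetical placeholders, and nothing here reflects an actual EoL or GBIF interface. The GBIF queries and curator loan requests are left out entirely.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Workbench:
    """Hypothetical model of one systematist's working circumscription."""
    owner: str
    specimens: Set[str] = field(default_factory=set)            # specimen record ids
    links: Dict[str, List[str]] = field(default_factory=dict)   # specimen id -> references
    locked: bool = False
    permanent_url: Optional[str] = None
    doi: Optional[str] = None

    def _require_unlocked(self) -> None:
        if self.locked:
            raise RuntimeError("workbench is locked; the circumscription is frozen")

    def add_specimen(self, specimen_id: str) -> None:
        self._require_unlocked()
        self.specimens.add(specimen_id)

    def remove_specimen(self, specimen_id: str) -> None:
        self._require_unlocked()
        self.specimens.discard(specimen_id)
        self.links.pop(specimen_id, None)  # drop references tied to that specimen

    def link(self, specimen_id: str, reference: str) -> None:
        """Attach an external reference (sequence record, citation) to a specimen."""
        self._require_unlocked()
        if specimen_id not in self.specimens:
            raise KeyError(specimen_id)
        self.links.setdefault(specimen_id, []).append(reference)

    def lock(self, base_url: str = "https://example.org/workbench/") -> str:
        """Freeze the circumscription and issue a URL derived from its content."""
        self._require_unlocked()
        payload = self.owner + "\n" + "\n".join(sorted(self.specimens))
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
        self.locked = True
        self.permanent_url = base_url + digest
        return self.permanent_url

    def attach_doi(self, doi: str) -> None:
        """Record the publication's DOI once the paper has been accepted."""
        if not self.locked:
            raise RuntimeError("lock the workbench before attaching a DOI")
        self.doi = doi


# Placeholder identifiers only; none of these refer to real records.
bench = Workbench(owner="systematist-1")
bench.add_specimen("specimen-A")
bench.add_specimen("specimen-B")
bench.remove_specimen("specimen-B")
bench.link("specimen-A", "genbank:PLACEHOLDER")
url = bench.lock()
bench.attach_doi("10.0000/placeholder")
```

Deriving the URL from a hash of the owner and the sorted specimen list is just one possible choice: it means the same circumscription always maps to the same address. A real system might well prefer a registry of opaque identifiers instead.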

This is vastly different from a purely post-hoc encyclopedia because the incentive is a simplified & accelerated workflow - a much lower-level entry point. So, the amount of teeth-pulling required to build an effective Encyclopedia of Life with content written by the scientific community is vastly reduced. The former model doesn't require accreditation for resources or material but the present model is a mess of very difficult "mash-ups". Sure, a first crack at the encyclopedia can be harvested content, but to be sustainable, it has to either adopt a distasteful monetization scheme (i.e. supported by click-through advertising) or create a low-level, organizational workbench of immense value to the scientific community, one whose value is very easily expressed to government and public funding agencies like NSF, NSERC, etc.

