Sunday, June 14, 2009

The Community is Dead

This may not be much of a revelation to many, but it is a notion that has been sinking in more deeply for me of late. By "Community", I don't necessarily mean the online community, though there are hints of that as well when you think of the MySpace->Facebook->Twitter progression from all-out friend fest to ever more insular & individualistic directions. I mean the taxonomic community.

I lead the LifeDesk application of the Encyclopedia of Life and have been trying to sell the notion of a taxon-centric "community" of taxonomists who want to get their content online in a human- and machine-readable format. Banding together means the workload can be shared: you gather the images, I'll get the text, she'll get the names in order, he'll get the bibliography, and so on. This is similar to the approach behind the Scratchpad philosophy. [Aside: there are apparently some who think Scratchpads and LifeDesks are duplicating efforts, but nothing could be further from the truth. Having both means choice, and that is a good thing because it strengthens both our directions and is a clear signal to taxonomists that there is something behind this.] While the Scratchpad/LifeDesk community-driven focus may work in a number of situations, it is by no means the rule. Rather, the chances are much greater that a taxonomist doesn't have a taxon-centric community of colleagues to share the workload with because, in fact, he or she may be the only one in the world working on that taxon. As a result, the majority of Scratchpads and LifeDesks will be "communities" of single individuals. So, I have been thinking a little more deeply about the Scratchpad/LifeDesk direction and think I see a way forward.

The clear signal from the Scratchpad/LifeDesk projects is that folks are doing primarily two things: 1) getting a bibliography online, and 2) getting taxonomic names in order. These two activities are largely divorced from one another because the workflow leaves a lot to be desired. Both are thankless tasks to begin with, regardless of the LifeDesk/Scratchpad environment, and a clumsy workflow only adds insult to injury. Why should these activities be so independent from one another? Here's what the workflow ought to be:

1. Upload PDF reprints
2. Look for a DOI & get the metadata from CrossRef. If none is found, prompt with a citation form (first check for an existing paper in the db to cut down on duplicates)
3. Scan the PDFs using TaxonFinder
4. Present flat lists of names found in individual PDFs
5. Drag these into a jsTree-based classification manager while retaining the name-reprint link in the background

This is the workflow that makes sense because when building a classification, one necessarily starts with publications, not some mythical list of names.
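Step 2 of that workflow is simple enough to sketch. The following is a rough illustration only, not anything that exists in LifeDesks or Scratchpads: it assumes the text has already been extracted from the PDF, that the DOI pattern Crossref generally recommends is good enough, and that metadata would come from Crossref's public REST endpoint. The function names (find_doi, crossref_url, next_step) and the sample DOI are made up for the example, and I have assumed the duplicate check is done on the DOI itself.

```python
import re
from urllib.parse import quote

# Pattern along the lines Crossref suggests for modern DOIs; it will miss
# some older, oddly formed ones (an accepted limitation of this sketch).
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def find_doi(text):
    """Return the first DOI-like string in the text, or None."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Drop sentence punctuation that often trails a DOI in running text,
    # and lowercase it because DOIs are case-insensitive.
    return match.group(0).rstrip(".,;").lower()


def crossref_url(doi):
    """Build the metadata lookup URL (assumes Crossref's public REST API)."""
    return "https://api.crossref.org/works/" + quote(doi, safe="/")


def next_step(text, known_dois):
    """Decide what to do with an uploaded reprint.

    known_dois is a hypothetical set of DOIs already in the local db.
    """
    doi = find_doi(text)
    if doi is None:
        return ("prompt_for_citation", None)
    if doi in known_dois:
        return ("duplicate", doi)
    return ("fetch_metadata", crossref_url(doi))


if __name__ == "__main__":
    sample = "J. Made-up Taxon. 12: 1-20. doi:10.1234/abc.5678."
    print(next_step(sample, known_dois=set()))
```

Actually fetching the metadata and rendering the citation form are left out on purpose; the point is only that the decision between "look it up", "already have it" and "ask the user" is a few lines of logic.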


Does the above make sense in a LifeDesk or a Scratchpad? It could certainly be a cool tool to help lower the bar of entry, but I seriously doubt it would get the traction in the taxonomic "community" that the tool would deserve. Rather, the application is best placed on the desktop as a rich, cross-platform app built in Adobe AIR or a similarly easy environment to develop in. Roll in some BitTorrent capabilities (egads!) and you have the start of a mechanism whereby reprints, names AND classifications may be shared and one could walk among the three in various ways. It would work because taxonomists need reprints and names AND there are plenty of residual names in any one reprint (i.e. names of use to someone else). If cleverly constructed, reconciliation of names remains an insular exercise that happens on the desktop (as it always has), but the sharing of these reconciliation groups / bibliographic metadata acts to enhance the findability of reprints.

Here's the challenge then. Build a service that accepts PDF reprints, finds the DOI (if present) & spits back the citation metadata for the article AND all the names (dedup'd and cleaned) it contains. I don't need any more taxonomic intelligence than that. Give it to me in JSON and I can whip up the jsTree-based interface to help individuals build their own reconciliation groups...all linked to reprints in their store.
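
To make the ask concrete, here is one guess at what the JSON coming back might look like. This is a sketch under my own assumptions: the field names (reprint, doi, citation, names) are invented, the raw name list stands in for whatever TaxonFinder would return, and "cleaned" here means nothing more than collapsing whitespace and dropping case-insensitive duplicates.

```python
import json


def clean_names(raw_names):
    """Collapse whitespace and drop case-insensitive duplicates.

    The first spelling seen for each name is the one that is kept.
    """
    seen = {}
    for raw in raw_names:
        name = " ".join(raw.split())
        if name:
            seen.setdefault(name.lower(), name)
    return sorted(seen.values())


def build_response(reprint_id, citation, raw_names):
    """Assemble the JSON payload for one reprint (hypothetical shape)."""
    payload = {
        "reprint": reprint_id,            # key back to the PDF in the store
        "doi": citation.get("doi"),       # None when no DOI was found
        "citation": citation,             # metadata from CrossRef or the form
        "names": clean_names(raw_names),  # flat list, ready for the tree UI
    }
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    citation = {"doi": None, "title": "A made-up revision", "year": 2009}
    found = ["Pardosa  moesta", "pardosa moesta", " Lycosidae ", ""]
    print(build_response("reprint-001", citation, found))
```

Keeping the reprint identifier in every payload is what lets the name-reprint link survive in the background once names are dragged into a classification.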