Friday, April 27, 2007

Client-orchestrated Data Repurposing

A lot of work is underway in various TDWG working groups to connect one machine to another for intelligent data exchange. For example, DiGIR and BioCASE (soon to be superseded by TAPIR) are nifty systems that create XML documents on the fly, and the data those documents contain can be dumped into other databases. This of course all happens behind the scenes, with no direct benefit to the providers of the biodiversity data except, I suppose, a demonstration to administrators that they have contributed to the greater good. Eventually, somewhere down the line, there may be some sort of attribution, but there's no guarantee.

GBIF does a great job of maintaining attribution because the ultimate goal is to let someone who uses their website discover where a specimen can be found and contact the curator. However, there's nothing stopping anyone from aggregating data from DiGIR providers and repurposing it without any sort of attribution or "link" back to the provider. In other words, an institution could have to cough up a lot of funds to keep the bandwidth pipes flowing and see no immediate value in return. These sorts of thoughts fly in the face of open access. Don't get me wrong: I'm all for open access. I'm just not certain that such a model for aggregating biodiversity data from museums and elsewhere is sustainable. What is needed at the very least is an auditing and logging tool associated with DiGIR (or TAPIR) so that providers of biodiversity data can see who used their resource, what was downloaded, and what the traffic patterns have been over 'x' number of days, weeks, months, etc. But I know of no such add-on for DiGIR or BioCASE providers.
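To make the idea of an auditing tool a little more concrete, here is a minimal Python sketch of the kind of summary a provider might want. Everything about it is an assumption made for illustration: the log format (timestamp, requesting host, resource name, number of records returned), the sample entries, and the function name `summarize` are all invented, and none of this is something DiGIR, BioCASE, or TAPIR is known to produce.

```python
from collections import Counter
from datetime import datetime

# Hypothetical access log, one request per line:
# ISO timestamp, requesting host, resource name, records returned.
SAMPLE_LOG = """\
2007-04-20T09:15:00 harvester.example.org spiders 250
2007-04-20T17:40:00 portal.example.net spiders 1000
2007-04-21T08:05:00 harvester.example.org spiders 250
"""


def summarize(log_text):
    """Return (records per (host, resource), requests per day)."""
    records_by_requester = Counter()
    requests_by_day = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        stamp, host, resource, count = line.split()
        # Who used which resource, and how many records they took.
        records_by_requester[(host, resource)] += int(count)
        # Traffic pattern: number of requests per calendar day.
        day = datetime.fromisoformat(stamp).date().isoformat()
        requests_by_day[day] += 1
    return records_by_requester, requests_by_day


if __name__ == "__main__":
    by_requester, by_day = summarize(SAMPLE_LOG)
    for (host, resource), total in by_requester.most_common():
        print(f"{host} took {total} records from '{resource}'")
    for day, n in sorted(by_day.items()):
        print(f"{day}: {n} request(s)")
```

A real tool would have to hook into the provider software to capture these requests in the first place, which is the part that appears to be missing.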

So, I have been looking into alternative means of sharing resources and have played around with various ideas. One such idea takes the form of gadgets to share imagery. There are a ton of really useful images of immense biological value, and once these get shared around, it becomes impossible to know where an image was first made available and who provided it. One could embed that information as meta tags within the image itself, but who does that? If there were a browser-based meta tag reader for images that web programmers could tap into, then meta tags would be the obvious choice. However, I'm not aware of any browser plug-in that can do that. So, here's a gadget script that can be copied and pasted onto a web site:

And here's the result:

The gadget itself is dynamically created JavaScript that pulls all the bits from the Nearctic Spider Database. The species nomenclature, attribution, and link to the species page are automatic and will change whenever I change anything in the database. These changes would of course cascade through all instances of the script, wherever they may be. The individual who "made" the gadget can, however, pretty it up as they like through a little configuration tool I have. Feel free to mess with that by clicking a "Link it" button here:

Something like this gadget system is by no means rocket science, but it has immediate value to both the provider and the individual wishing to repurpose the content for their web site.
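For the curious, the server side of a gadget like the one described above could look roughly like the following Python sketch. This is not the code behind the Nearctic Spider Database gadget. The record fields, the example URLs, the photo credit, the `border` option, and the function name `build_gadget_js` are all hypothetical. The point is only that the script is generated from the database on each request, so the name, attribution, and link stay current everywhere the gadget is embedded.

```python
import html
import json

# Hypothetical record, standing in for a row pulled from the provider's database.
RECORD = {
    "species": "Araneus diadematus",
    "image_url": "https://example.org/images/araneus-diadematus.jpg",
    "species_url": "https://example.org/species/araneus-diadematus",
    "credit": "Photo: J. Doe",
}


def build_gadget_js(record, border="1px solid #ccc"):
    """Return JavaScript that writes the image, name, link, and credit."""
    # Escape every value so database content cannot break the markup.
    markup = (
        f'<div style="border:{html.escape(border)}">'
        f'<a href="{html.escape(record["species_url"])}">'
        f'<img src="{html.escape(record["image_url"])}" '
        f'alt="{html.escape(record["species"])}"></a><br>'
        f'<em>{html.escape(record["species"])}</em><br>'
        f'{html.escape(record["credit"])}</div>'
    )
    # json.dumps produces a quoted, escaped string literal that JavaScript
    # accepts; "</" is rewritten so the literal cannot close a <script> tag.
    literal = json.dumps(markup).replace("</", "<\\/")
    return f"document.write({literal});"


if __name__ == "__main__":
    # 'border' stands in for a styling choice made in a configuration tool.
    print(build_gadget_js(RECORD, border="2px dashed green"))
```

A web site would then embed the gadget with a single script tag pointing at whatever address serves this output, and only the styling options would be under the embedder's control.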

