Monday, July 1, 2013

NameSpotter: Experiences Making a Google Chrome Extension

If you believe various browser penetration statistics, Google Chrome is more popular than any other browser in use today. And, with the imminent demise of the popular iGoogle homepage later this year, my suspicion is that Chrome apps and extensions will grow in popularity...something is no doubt in the works in Mountain View. Many of these apps and extensions are freely available on the Chrome Web Store. Some time ago I decided to poke around and learn what it takes to develop a Google Chrome extension. I was pleasantly surprised at how easy it was. It's a good time to finally write about my experience, especially since I spent a few hours yesterday renewing an extension I designed called NameSpotter, whose code is on GitHub. The first part of this post is a description of what my NameSpotter extension does and the second part is cursory background to Chrome Extension authoring.

What Does the NameSpotter Extension Do?

The NameSpotter Chrome extension puts a Morpho butterfly icon in your browser toolbar. Click it and its wings start to flap while the content of the current web page is sent to the Global Names Recognition and Discovery web service, where two scientific name-finding engines, TaxonFinder and NetiNeti, take action. A list of scientific names returns and these are highlighted on the page. Run your cursor over a name and a tooltip draws up content about that name from the Encyclopedia of Life. A resizable navigation panel appears at the bottom of the page that lets you jump from name to name or copy all or some of the found names to your clipboard for pasting elsewhere. Many of these features are customizable from a settings area in the navigation panel. For example, if you prefer not to have tooltips, you can turn those off.

The interface is in English or French, depending on your system settings. And, only common names (when known by EOL) in the language of your system settings are shown. If EOL had an API that accepted locale as a parameter, I'd use that too. As it stands, if your system settings are in French, you're going to get English content in your tooltips.

If you happen to be viewing a PDF while in Chrome and click the Morpho icon in your browser toolbar, you get a panel at the bottom of the page as usual but you don't get the tooltips. PDFs (sadly) are not HTML so there's nothing I can do to manipulate the static content you're reading.

There's quite a bit of action in this extension with many messages that get passed around from server to your browser and within the extension itself. I also made use of two very excellent jQuery plugins called Highlight and Tooltipster. The most difficult aspect to handle was making sure the extension performs well, especially when there might be hundreds if not thousands of names on the page. This is where a little knowledge about the performance of jQuery selectors comes in handy.
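One general way to keep highlighting cheap when there are many names is to scan the text once with a single combined pattern instead of once per name. The sketch below illustrates only that idea. It is not the code of the Highlight plugin or of NameSpotter, and the function names and the ns-highlight class are made up for illustration.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one alternation pattern for all names, longest first, so that
// "Morpho menelaus" is preferred over the shorter "Morpho".
function buildNamePattern(names) {
  var sorted = names.slice().sort(function (a, b) {
    return b.length - a.length;
  });
  return new RegExp("\\b(" + sorted.map(escapeRegExp).join("|") + ")\\b", "g");
}

// Wrap every matched name in a span. "ns-highlight" is a hypothetical class.
function highlightNames(text, names) {
  if (names.length === 0) {
    return text;
  }
  return text.replace(
    buildNamePattern(names),
    '<span class="ns-highlight">$1</span>'
  );
}
```

In a real content script this would be applied to the text nodes of the page rather than to raw HTML, so that tags and attributes are never rewritten.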

What's Next?

There are plenty of other aspects to this extension that could be explored such as:
  • Auto-indexing URLs and scientific names: Any click of the Morpho that results in found names could send the URL in the browser bar and the names found to an index without the user knowing. This would be equivalent to crowd-sourced web spidering. An aggregator of content like EOL might be very interested to receive URL + name combinations to make an auto-generating outlinks section.
  • Other sources of content in tooltips: I designed the tooltip to be (relatively) flexible. If there are other resources that accept a scientific name as a query parameter in a JSON-based RESTful API, I could wire up their responses. If you are more interested in nomenclatural acts, I could for example use ZooBank's APIs. Or, I could send the namestring to CrossRef's API and pull back some recent publications. There's really no limit to sources of data. What's limiting is my appreciation of what's useful in this framework.
  • Sending annotations to the host: This one's a bit half-baked, but why can't a Google Chrome extension be used to push annotations, questions, and comments to the host website? You'd need a bit of OAuth and something on the web page to inform the extension that the host is willing to accept annotations and where to send them. Something like webhooks comes to mind.
Do you have other suggestions?

How Do You Make a Chrome Extension?

Google Chrome extensions are remarkably simple, based solely on HTML, CSS and JavaScript, the basic tools of web page development. In contrast, Firefox and Internet Explorer extensions are horribly complex and their documentation for first-timers is equally terrible. The documentation for Chrome extensions is wonderful with plenty of tutorials and free samples. Development is made especially easy because you can load an "unpacked" extension, tweak it, reload it, and iterate with a few clicks from your Chrome extensions page while in developer mode.

Chrome extensions have three basic parts: 1) a metadata file, 2) content scripts, and 3) background pages/event scripts. They also rely on one very important construct: passing messages.

Parts of a Chrome Extension

Metadata File

The metadata file is a static JSON document called the Manifest. It contains basic information about the extension such as a title, description, and default locale (language), as well as more complex concepts such as permissions and the local JavaScript and CSS files the extension will use. As I learned the hard way, you cannot put bits in your manifest (e.g. configuration variables) that Google doesn't expect to see. So, if you need configuration variables as I did, you have to do a bit of AJAX to grab the contents of your own static file:

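nsbg.loadConfig = function() {
  var self = this,
      url = chrome.extension.getURL('/config.json');

  // Synchronous jQuery request for the static file bundled with the extension
  $.ajax({
    type : "GET",
    async : false,
    url : url,
    success : function(data) {
      // Keep the parsed configuration on the background object
      self.config = $.parseJSON(data);
    }
  });
};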
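The request above is synchronous, so the script waits until the file has been read. A non-blocking variant might look like the sketch below. It is only an illustration: loadConfig, parseConfig and the getText helper are hypothetical names, and getText stands in for whatever performs the actual request.

```javascript
// Parse the text of a config file, reporting bad JSON through the callback
// instead of throwing.
function parseConfig(text, done) {
  var config;
  try {
    config = JSON.parse(text);
  } catch (err) {
    done(err);
    return;
  }
  done(null, config);
}

// Load and parse a config file without blocking.
// getText(url, callback) is a hypothetical helper supplied by the caller;
// it must call callback(err, text) once the file has been read.
function loadConfig(url, getText, done) {
  getText(url, function (err, text) {
    if (err) {
      done(err);
      return;
    }
    parseConfig(text, done);
  });
}

// Possible use inside the extension, with jQuery doing the request:
//
// loadConfig(chrome.extension.getURL('/config.json'), function (url, cb) {
//   $.get(url, function (text) { cb(null, text); }, "text");
// }, function (err, config) {
//   if (!err) { nsbg.config = config; }
// });
```

Passing the fetching function in as an argument keeps the parsing and error handling easy to try out away from the browser.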

Content Script(s)

Content Scripts are the JavaScript and CSS files that you want injected into the web page. They run in the context of web pages, but are encapsulated in their own space. You cannot execute JavaScript functions that might already be declared in the source of a web page. Besides, how would an extension know that such a function could be executed? Nonetheless, if you need jQuery or any other library to write your content scripts, drop 'em in a folder in your extension and declare 'em in your manifest. It is that simple. Content scripts do however have access to the DOM of web pages. So, you can modify links, access the pictures, or any other content on the web page via your content script.

Background Page / Event Script(s)

Background pages do as their name suggests - they run in the background and don't require user interaction to do so. Event scripts are similar to background pages but are more friendly toward system resources because they can free memory when not needed. Why do you need background or event scripts? A good candidate for a background page is material for a user interface. Another important reason for background pages or event scripts is that these have capabilities that content scripts do not, e.g. access to the context menu or bookmarks. You don't always need a content script, but you always need a background page or event script. If you do have a content script, a good rule of thumb is to keep it mean and lean and dump the heavy lifting into a background or event script.
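The manifest is where content scripts and the background or event script get declared. Here is a rough sketch of what such a manifest might contain, written as a JavaScript object so it can be checked and serialized; the real file is plain JSON saved as manifest.json. The keys are standard manifest (version 2) keys, but every file name, the permission, and the message names are placeholders rather than NameSpotter's actual values.

```javascript
// Sketch of a manifest, expressed as a JavaScript object.
// All file names, the permission and the __MSG_*__ names are hypothetical.
var manifest = {
  manifest_version: 2,
  name: "__MSG_extName__",               // looked up in _locales/<lang>/messages.json
  description: "__MSG_extDescription__",
  version: "1.0",
  default_locale: "en",
  permissions: ["contextMenus"],
  browser_action: {
    default_icon: "images/icon.png",     // the toolbar icon
    default_title: "__MSG_extName__"
  },
  background: {
    scripts: ["js/jquery.js", "js/background.js"],
    persistent: false                    // false makes this an event script
  },
  content_scripts: [{
    matches: ["http://*/*", "https://*/*"],
    css: ["css/content.css"],
    js: ["js/jquery.js", "js/content.js"]
  }]
};

// Serialize to the JSON text that would be saved as manifest.json.
var manifestJson = JSON.stringify(manifest, null, 2);
```

Note that jQuery is listed twice: content scripts and background scripts live in separate worlds, so each needs its own copy declared.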

Passing Messages

This is the most difficult part of a Chrome extension, but powerful once you understand why it's needed. Background scripts have access to system-like functions that content scripts do not and content scripts respond to user interaction whereas background scripts (mostly) cannot. You bridge the two worlds by passing messages.

Here's a method in a content script that broadcasts a message:

chrome.extension.sendMessage({ /* JSON message */ }, function(response) {
  //do something with response
});

...and the background/event script that listens for broadcast messages and responds back.

chrome.extension.onMessage.addListener(function(request, sender, sendResponse) {
  //do something
  sendResponse(/* response body */);
});

I used the word "broadcast" because as you see from the above, there's nothing that indicates who sent the message or what it might contain. You avoid clashes with other installed modules that also use messages by constructing the body of your messages with care. In my case, I construct messages in my content scripts to contain the equivalent of a title in addition to a body so I know I'm the one who sent the message:

{ "message" : "ns_clipBoard", "content" : "my stuff" }
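On the listening side, a title like this makes it easy to route each message to the right piece of code. Here is a minimal sketch of that idea. Only the "ns_clipBoard" title comes from the message above; the "ns_ping" title, the handler bodies and the dispatch function are hypothetical.

```javascript
// Handlers keyed by message title. The bodies are placeholders.
var handlers = {
  ns_clipBoard: function (content) {
    return { copied: content.length };
  },
  ns_ping: function () {
    return { ok: true };
  }
};

// Route one request to its handler. Returns true if the title was one of
// ours and a response was sent, false otherwise.
function dispatch(request, sender, sendResponse) {
  var title = request && request.message;
  if (!Object.prototype.hasOwnProperty.call(handlers, title)) {
    return false; // unknown title: stay silent
  }
  sendResponse(handlers[title](request.content, sender));
  return true;
}

// Register the listener only where the extension API is available.
if (typeof chrome !== "undefined" && chrome.extension && chrome.extension.onMessage) {
  chrome.extension.onMessage.addListener(function (request, sender, sendResponse) {
    // The result of dispatch is deliberately not returned: returning true
    // from a listener tells Chrome the response will be sent later.
    dispatch(request, sender, sendResponse);
  });
}
```

Adding a new kind of message then means adding one entry to the handlers table rather than another branch in the listener.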