A quick, and not altogether satisfactory hack, but I've added a simple interactive treemap to BioStor. It's essentially a remake of the Catalogue of Life treemap I created in 2008, but coloured by the number of references I've extracted from BHL.
Jim Croft drew my attention to a cool crowd-sourcing project to convert scans of Australia's newspapers to text. The site has a nice chart showing the project's coverage of the Australian newspapers, which motivated me to show something similar for BioStor.
My BioStor project has reached over 13,000 articles, making it a sizeable repository of open access articles on biodiversity. It's still a tiny fraction of what could be extracted from the Biodiversity Heritage Library (BHL), but perhaps it's worth taking stock of what's there. Coverage: One pleasing discovery is that, despite the 1923 cut-off due to U.S. copyright, BHL contains a lot of post-1923 articles.
Given that a new decade prompts predictions, as well as New Year's resolutions, and that 2010 is the International Year of Biodiversity, which comes complete with glossy web sites and calls for action, I'm making some predictions of my own, inspired in part by Eric Hellman's Ten Predictions for the Next Ten Years. I won't be nearly as bold as Eric: I'm limiting myself to biodiversity informatics and the coming year.
Today I finally got a project out the door. BioStor is my take on what an interface to the Biodiversity Heritage Library (BHL) could look like. It features the visualisations I've mentioned in earlier posts, such as Google maps based on extracted localities, and tag trees.
I've been buried in programming (and it's exam time at Glasgow) so I've not blogged for a month (gasp). I've been playing with ways to visualise Biodiversity Heritage Library content for a while (click here for a list of previous posts), and have occasionally surfaced to tweet a screenshot via twitpic.
I've added a feature to my Biodiversity Heritage Library viewer that should help make sense of the names found on a page. Until now I've displayed them as a list of "tags", which ignores the relations among the names.
Sadly I won't be at TDWG 2009, at least not in person. However, there is a session on wikis, which may contain this brief screencast of my iTaxon experiments. The screencast was made in haste, but tries to convey some of the ideas behind these experiments, especially the idea that by linking data together we can generate more interesting and rich views of objects such as scientific publications.
One of the more glaring limitations of my BHL viewer described in the previous post is that it can take a while to load all the page thumbnails (there can be hundreds). Given that one of the original motivations for this project was a faster viewer, this kinda sucks.
In between the chaos that is term-time I've been playing with ways to view Biodiversity Heritage Library content. The viewer is crude, and likely to go off-line at any moment while I fuss with it, but you can view an example here.
Continuing with my exploration of the Biodiversity Heritage Library, one obstacle to linking BHL content with nomenclature databases is the lack of a consistent way to refer to the same bibliographic item (e.g., book or journal). For example, the Amphibia Species of the World (ASW) page for Gastrotheca aureomaculata gives the first reference for this name as: Gastrotheca aureomaculata Cochran and Goin, 1970, Bull. U.S. Natl.