Rogue Scholar

Published September 18, 2015

Currently in classes where I teach the basics of tree building, we still fire up ancient iMacs, load up MacClade, and let the students have a play. Typically we give them the same data set and have a class competition to see which group can get the shortest tree by manually rearranging the branches. It’s fun, but the computers are old, and what’s nostalgic for me seems alien to the iPhone generation.

BHL, Crossref, DataCite, DOI, Plazi, Computer and Information Sciences

On having multiple DOI registration agencies for the same journal

https://doi.org/10.59350/h4y05-k7j13

Published September 17, 2015

Author Roderic Page

On Friday I discovered that BHL has started issuing CrossRef DOIs for articles, starting with the journal Revue Suisse de Zoologie. The metadata for these articles comes from BioStor. After a WTF and WWIC moment, I tweeted about this, and something of a Twitter storm (and email storm) ensued. To be clear, I'm very happy that BHL is finally assigning article-level DOIs, and that it is doing this via CrossRef.

Facebook, Knowledge Graph, Natural Language Queries, Social Graph, Unicorn, Computer and Information Sciences

Possible project: natural language queries, or answering "how many species are there?"

https://doi.org/10.59350/q9n0t-zx480

Published September 11, 2015

Author Roderic Page

Google knows how many species there are. More significantly, it knows what I mean when I type in "how many species are there". Wouldn't it be nice to be able to do this with biodiversity databases? For example, how many species of insect are found in Fiji? How would you answer this question? I guess you'd Google it, looking for a paper.

Wikidata, Wikipedia, Computer and Information Sciences

Wikidata, Wikipedia, and #wikisci

https://doi.org/10.59350/es31s-h8v03

Published September 7, 2015

Author Roderic Page

Last week I attended the Wikipedia Science Conference (hashtag: #wikisci) at the Wellcome Trust in London. It was an interesting two days of talks and discussion. Below are a few random notes on topics that caught my eye. What is Wikidata? A recurring theme was the emergence of Wikidata, although it never really seemed clear what role Wikidata saw for itself.

Annotation, BHL, BioStor, Hypothes.is, Computer and Information Sciences

Hypothes.is revisited: annotating articles in BioStor

https://doi.org/10.59350/5d5x5-qzk46

Published September 2, 2015

Author Roderic Page

Over the weekend, out of the blue, Dan Whaley commented on an earlier blog post of mine (Altmetrics, Disqus, GBIF, JSTOR, and annotating biodiversity data). Dan is the project lead for hypothes.is, a tool to annotate web pages.

DNA Barcoding, Environmental DNA, Computer and Information Sciences

Dark taxa, drones, and Dan Janzen: 6th International Barcode of Life Conference

https://doi.org/10.59350/5xxw2-yk380

Published September 1, 2015

Author Roderic Page

A little over a week ago I was at the 6th International Barcode of Life Conference, held at Guelph, Canada. It was my first barcoding conference, and was quite an experience. Here are a few random thoughts. Attendees: It was striking how diverse the conference crowd was. Apart from a few ageing systematists (including veterans of the cladistics wars), most people were young(ish), and from all over the world.

Namestream, Possible Project, Taxonomic Names, Computer and Information Sciences

Possible project: NameStream - a stream of new taxonomic names

https://doi.org/10.59350/68pt9-7zv22

Published August 14, 2015

Author Roderic Page

Yet another barely thought out project, although this one has some crude code. If some 16,000 new taxonomic names are published each year, then that is roughly 40 per day. We don't have a single place that aggregates these, so any major biodiversity project is by definition out of date. GBIF itself hasn't had an updated list of fungi or plant names for several years, and at present doesn't have an up-to-date list of animal names.
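As a quick sanity check on that back-of-the-envelope figure, here is the arithmetic as a tiny Python sketch. The 16,000 names per year is the number quoted above; the function name and the 365-day year are my own assumptions for illustration.

```python
NAMES_PER_YEAR = 16_000  # figure quoted in the post
DAYS_PER_YEAR = 365      # assumption: ignore leap years


def names_per_day(names_per_year, days_per_year=DAYS_PER_YEAR):
    """Average number of new names published per day."""
    return names_per_year / days_per_year


if __name__ == "__main__":
    # Prints the daily average to one decimal place.
    print(f"{names_per_day(NAMES_PER_YEAR):.1f} new names per day")  # 43.8 new names per day
```

So "roughly 40 per day" is, if anything, a slight underestimate.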

Possible Project, PubMed Central, Computer and Information Sciences

Possible project: A PubMed Central for taxonomy

https://doi.org/10.59350/837e3-9k809

Published August 14, 2015

Author Roderic Page

I need more time to sketch this out fully, but I think a case can be made for a taxonomy-centric (or, perhaps more usefully, a biodiversity-centric) clone of PubMed Central. Why? We already have PubMed Central, and a European version, Europe PubMed Central, and the content of Open Access journals such as ZooKeys appears in both, so, again, why?

BHL, Cloudant, CouchDB, DjVu, Search, Computer and Information Sciences

Demo of full-text indexing of BHL using CouchDB hosted by Cloudant

https://doi.org/10.59350/4crdc-fm682

Published August 10, 2015

Author Roderic Page

One of the limitations of the Biodiversity Heritage Library (BHL) is that, unlike say Google Books, its search functions are limited to searching metadata (e.g., book and article titles) and taxonomic names. It doesn't support full-text search, by which I mean you can't just type in the name of a locality, specimen code, or a phrase and expect to get back much in the way of results.
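To make the distinction concrete, here is a toy in-memory inverted index in Python. It is only a sketch of what full-text search means, not the CouchDB/Cloudant approach used in the demo, and the page identifiers and page text are invented for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each token to the set of page ids whose text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for token in tokenize(text):
            index[token].add(page_id)
    return index


def search(index, query):
    """Return ids of pages containing every token in the query (AND match)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical OCR text for two scanned pages (not real BHL content).
PAGES = {
    "page-1": "Collected at Suva, Fiji, specimen code ABC 1234.",
    "page-2": "A new species of beetle from Viti Levu, Fiji.",
}
INDEX = build_index(PAGES)

if __name__ == "__main__":
    print(sorted(search(INDEX, "Fiji")))      # ['page-1', 'page-2']
    print(sorted(search(INDEX, "ABC 1234")))  # ['page-1']
```

A metadata-only search would find neither page unless "Fiji" or the specimen code happened to appear in a title. Note that this sketch matches pages containing all the query tokens, not exact phrases.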

ISNI, ORCID, Possible Project, Wikipedia, Computer and Information Sciences

Possible project: mapping authors to Wikipedia entries using lists of published works

https://doi.org/10.59350/25rrm-gaj56

Published August 10, 2015

Author Roderic Page

One of the less glamorous but necessary tasks of data cleaning is mapping "strings to things", that is, taking strings such as "George A. Boulenger" and mapping them to identifiers, such as ISNI: 0000 0001 0888 841X. In the case of authors such as George Boulenger, one way to do this would be through Wikipedia, which has entries for many scientists, often linked to identifiers for those people (see the bottom of the Wikipedia page for George A.

iPhylo

Towards an interactive web-based phylogeny editor (à la MacClade)
