Computer and Information Sciences, Blogger

iPhylo

Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed. ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.
BioStor, Lucene, Search, Solr, Computer and Information Sciences
Published

Prompted by the appearance on the BHL blog of an article about BioStor, I've been thinking about how to improve what is basically a fairly clunky tool. One major weakness is searching the collection of nearly 40,000 articles extracted from BHL. Note the word "extracted." BioStor isn't a tool like PubMed or Google Scholar where the goal is to find articles on a topic.

API, Authorship, Hack, Hack4knowledge, Identity, Computer and Information Sciences
Published

Inspired by the forthcoming Hack4Knowledge, I've put together a service that enables you to assert that you are the author of a paper using the Mendeley API. If you are impatient, give it a try at: http://iphylo.org/~rpage/hack4knowledge/iwrotethat/ To use it you need a Mendeley account. When you go to "I wrote that" you will be asked to connect to your Mendeley account.

BHL, BioStor, CouchDB, PubMed Central, Replication, Computer and Information Sciences
Published

Last December I released a web site called Australian Faunal Directory on CouchDB, which was part of my ongoing exploration of how to build a simple yet useful database of taxonomic names. In particular, I want to link names directly to the primary taxonomic literature.

BioStor, BMC Bioinformatics, Google Scholar, Published, Computer and Information Sciences
Published

My article describing BioStor, "Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library", has finally seen the light of day in BMC Bioinformatics (doi:10.1186/1471-2105-12-187; the DOI is not working at the moment, so give it a little while to go live. In the meantime you can access the article here). Getting this article published was more work than I expected.

Background, BHL, BioStor, DjVu, RTFM, Computer and Information Sciences
Published

One of the biggest challenges I've faced with the BioStor project, apart from dealing with messy metadata, has been handling page images. At present I get these from the Biodiversity Heritage Library. They are big (typically 1 MB in size), and have the caramel colour of old paper.

Citation, Data, Dryad, Computer and Information Sciences
Published

Interest in archiving data and data publication is growing, as evidenced by projects such as Dryad, and earlier tools such as TreeBASE. But I can't help wondering whether this is a little misguided. I think the issues are granularity and reuse. Taking the second issue first, how much reuse do data sets get? I suspect the answer is "not much". I think there are two clear use cases: repeatability of a study, and benchmarks.

Mendeley, Web Hooks, Computer and Information Sciences
Published

Quick, poorly thought out idea. I've argued before that Mendeley seems the obvious tool to build a "bibliography of life." It has pretty much all the features we need: nice editing tools, support for DOIs, PubMed identifiers, social networking, etc. But there's one thing it lacks. There's no easy way to transmit updates from Mendeley to another database.