Computer and Information Sciences, English, Blogger

iPhylo

Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed. ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.
BHL, Bibliographies, Bioguid, EOL, JSON

One thing about the Encyclopedia of Life which bugs me no end is the awful way it displays the bibliography generated from the Biodiversity Heritage Library (BHL). The image on the right shows the bibliography for the frog Hyla rivularis Taylor, 1952. It's one long, alphabetical list of pages. How can a user make sense of this?

Citation, Data Quality, Google Scholar, Mendeley, Metadata

Hot on the heels of Geoffrey Nunberg's essay about the train wreck that is Google Books metadata (see my earlier post) comes Google Scholar’s Ghost Authors, Lost Authors, and Other Problems by Péter Jacsó. It's a fairly scathing look at some of the problems with the quality of Google Scholar's metadata. Now, Google Scholar isn't perfect, but it's come to play a key role in a variety of bibliographic tools, such as Mendeley and Papers.

Mediawiki, Semantic Web, TreeBASE, Wiki, Workshop

At the start of this week I took part in a biodiversity informatics workshop at the Naturhistoriska riksmuseet, organised by Kevin Holston. It was a fun experience, and Kevin was a great host, going out of his way to make sure that the other contributors and I were looked after.

Ants, History Flow, Pyramica, Strumigenys, Wikipedia

Stumbled across Alex Wild's post Pyramica vs Strumigenys: why does it matter?, which takes as its starting point a minor edit war on the Wikipedia page for Pyramica. Alex gives the background to the argument about whether Pyramica is a synonym of Strumigenys, and investigates the issue using the surprisingly small amount of data available in GenBank.

Gene Wiki, Google, Wikipedia

Andrew Su has posted an analysis of Gene Wiki, a project to provide Wikipedia pages on every human gene: This result is interesting in that an existing resource (Gene Cards) beats Wikipedia, but only just.

History Flow, SVG, Visualisation, Wikipedia

Quick post (really should be doing something else). Reading Jeff Atwood's post Mixing Oil and Water: Authorship in a Wiki World led me to IBM's wonderful history flow tool to visualise the edit history of a Wikipedia page. There's a nice paper describing history flow (doi:10.1145/985692.985765, free PDF here). Inspired by this I decided to try and implement history flow in PHP and SVG.

Google, Wikipedia

One response to my post on Fungi in Wikipedia was to say that fungi are also charismatic, so maybe I should try [insert unsexy taxon name here]. So, I've now looked at all the species I extracted from Wikipedia (nearly 72,000), run the Google searches, and here are the results: Site | How many times is it the top

Fungi, Google, Search, Wikipedia

One response to the analysis I did of the Google rank of mammal pages in Wikipedia is to suggest that Wikipedia does well for mammals because these are charismatic. It's been suggested that for other groups of taxa Wikipedia might not be so prominent in the search results. As a quick test I extracted the 1552 fungal species I could find in Wikipedia and repeated the analysis.

Clay Shirky, EOL, Google, Power Law, Search

One assumption I've been making so far is that when people search for information on an organism using its scientific name, Wikipedia will dominate the search results (see my earlier post for an example of this assumption). I've decided to quantify this by doing a little experiment. I grabbed the Mammal Species of the World taxonomy and extracted the 5416 species names. I then used Google's AJAX search API to look up each name in Google.
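
The tallying step of an experiment like this can be sketched in a few lines of Python. This is only an illustration, not the code used for the post: it assumes the search results have already been collected into a dictionary mapping each species name to its ranked list of result URLs, and it makes no search API calls at all. The function names `top_hit_site` and `tally_top_hits` and the sample URLs are made up for the example.

```python
from collections import Counter
from urllib.parse import urlparse


def top_hit_site(urls):
    """Return the host of the first result, or None if there were no results."""
    if not urls:
        return None
    host = urlparse(urls[0]).netloc.lower()
    # Treat "www.example.org" and "example.org" as the same site.
    return host[4:] if host.startswith("www.") else host


def tally_top_hits(results_by_name):
    """Count how often each site is the top hit across all species names."""
    counts = Counter()
    for urls in results_by_name.values():
        site = top_hit_site(urls)
        if site is not None:
            counts[site] += 1
    return counts


# Made-up results for two names, purely to show the assumed input shape.
example = {
    "Panthera leo": [
        "https://en.wikipedia.org/wiki/Lion",
        "https://example.org/lion",
    ],
    "Mus musculus": [
        "https://www.example.org/mouse",
        "https://en.wikipedia.org/wiki/House_mouse",
    ],
}

for site, count in tally_top_hits(example).most_common():
    print(site, count)
```

Grouping by host name is a simplification: it counts different language editions of Wikipedia as separate sites, which may or may not be what you want.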