Computer and Information Sciences, English, Blogger

iPhylo

Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed. ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.
Clay Shirky, EOL, Google, Power Law, Search

One assumption I've been making so far is that when people search for information on an organism using its scientific name, Wikipedia will dominate the search results (see my earlier post for an example of this assumption). I've decided to quantify this by doing a little experiment. I grabbed the Mammal Species of the World taxonomy and extracted the 5416 species names. I then used Google's AJAX search API to look up each name in Google.
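
A minimal sketch of how such a tally could be computed once the result lists are in hand. It is not the code used for the experiment: the function names (`is_wikipedia`, `wikipedia_rank`, `summarise`) are hypothetical, the search step itself is left out, and the example result lists are made-up placeholders rather than real search output.

```python
from urllib.parse import urlparse


def is_wikipedia(url):
    """True if the URL's host is wikipedia.org or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return host == "wikipedia.org" or host.endswith(".wikipedia.org")


def wikipedia_rank(result_urls):
    """Return the 1-based rank of the first Wikipedia hit, or None if absent."""
    for rank, url in enumerate(result_urls, start=1):
        if is_wikipedia(url):
            return rank
    return None


def summarise(results_by_name):
    """Count how often Wikipedia is the top hit, and how often it appears at all."""
    ranks = [wikipedia_rank(urls) for urls in results_by_name.values()]
    return {
        "names": len(ranks),
        "wikipedia_first": sum(1 for r in ranks if r == 1),
        "wikipedia_present": sum(1 for r in ranks if r is not None),
    }


# Placeholder data for illustration only (not real search results).
example_results = {
    "Panthera leo": ["https://en.wikipedia.org/wiki/Lion", "https://example.org/lion"],
    "Mus musculus": ["https://example.org/mouse", "https://en.wikipedia.org/wiki/House_mouse"],
    "Examplus fictus": ["https://example.org/nothing"],
}
summary = summarise(example_results)
print(summary)
```

Keeping the rank rather than a simple yes/no makes it possible to ask later how often Wikipedia lands in the top three or top ten, not just first place.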

Classification, Mammal Species Of The World, Mammals, MSW, Wikipedia

Continuing the saga of making sense of the mammal classification in Wikipedia, I've done a quick comparison with the Mammal Species of the World (third edition) classification. MSW is the default taxonomic reference used by WikiProject Mammals.

Classification, Mammals, Visualisation, Wikipedia

Following on from my previous post about visualising the mammalian classification in Wikipedia, I've extracted the largest component from the graph for all mammal taxa in Wikipedia, and it is a tree. This wasn't apparent in the previous diagram, where the component appeared as a big ball due to the layout algorithm used.
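
For readers who want to try the same check, here is a rough sketch of one way to pull out the largest component of a graph and test whether it is a tree. It assumes the classification is available as a list of (child, parent) pairs, which is an assumption about the data rather than a description of how the post's graph was built; the function names and the small edge list are illustrative only.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Return the connected components of an undirected graph as sets of nodes."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search from each node not yet visited.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def is_tree(nodes, edges):
    """A connected component is a tree iff it has exactly len(nodes) - 1 edges.

    Duplicate edges are collapsed; a self-loop counts as an edge and so
    prevents the component from being a tree.
    """
    inside = {frozenset((a, b)) for a, b in edges if a in nodes and b in nodes}
    return len(inside) == len(nodes) - 1


# Illustrative (child, parent) pairs; A, B, C are a deliberately cyclic group.
edges = [
    ("Panthera", "Felidae"), ("Felis", "Felidae"),
    ("Felidae", "Carnivora"), ("Canidae", "Carnivora"),
    ("A", "B"), ("B", "C"), ("C", "A"),
]
components = connected_components(edges)
largest = max(components, key=len)
print(len(largest), is_tree(largest, edges))
```

The edge-count test only works because each component is connected by construction; on an arbitrary set of nodes it would not be enough.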

Australian Systematic Botany, Citation, Citation Needed, Impact Factor, Nuytsia

While thinking about measuring the quality of Wikipedia articles by counting the number of times they cite external literature, and conversely measuring the impact of papers by how many times they're cited in Wikipedia, I discovered, as usual, that somebody has already done it. I came across this nice paper by Finn Årup Nielsen (arXiv:0705.2106v1) (originally published in First Monday as an HTML document; I've embedded the PDF from arXiv

Bioguid, Future, ISpecies, Mashup, Plans

What follows are some random thoughts as I try and sort out what things I want to focus on in the coming days/weeks. If you don't want to see some wallowing and general procrastination, look away now. I see four main strands in what I've been up to in the last year or so: services, mashups, wikis, and phyloinformatics. Let's take these in turn. Services: not glamorous, but necessary.

GBIF, GUIDs, Linked Data

At the end of day two of the GBIF LSID-GUID Task Group I put together this crude diagram to summarise some of the possible links between biodiversity data and the larger linked data cloud, which I, among others, have argued is where biodiversity informatics should be heading. Here's my hastily put together diagram (created using the wonderful OmniGraffle). I've put GBIF at the centre since we're at GBIF, and it's them we are trying to convince.

Georeferencing, GeoRSS, Google Maps, RSS, Wikispecies

Following on from my previous post about Wikispecies (which generated some discussion on TAXACOM) I've played some more with Wikispecies. As a first step I've added a Wikispecies RSS feed to my list of RSS feeds. This feed takes the original Wikispecies RSS feed for new pages (generated by the page Special:NewPages) and tries to extract some details before reformatting it as an Atom feed.
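
The reformatting step might look something like the sketch below, which maps RSS 2.0 items onto Atom entries using only the Python standard library. This is an assumption-laden illustration, not the code behind the feed: `rss_to_atom`, the `feed_id` parameter, and the sample RSS are all hypothetical, the "extract some details" step is not attempted, and the output omits elements a complete Atom feed needs (such as a feed-level `updated` and an `author`).

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

ATOM_NS = "http://www.w3.org/2005/Atom"


def atom(tag):
    """Qualify a tag name with the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"


def rss_to_atom(rss_xml, feed_id):
    """Convert an RSS 2.0 document (as a string) into a bare-bones Atom feed."""
    ET.register_namespace("", ATOM_NS)  # serialise Atom as the default namespace
    channel = ET.fromstring(rss_xml).find("channel")

    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("title")).text = channel.findtext("title", default="")
    ET.SubElement(feed, atom("id")).text = feed_id

    for item in channel.findall("item"):
        entry = ET.SubElement(feed, atom("entry"))
        link = item.findtext("link", default="")
        ET.SubElement(entry, atom("title")).text = item.findtext("title", default="")
        ET.SubElement(entry, atom("id")).text = link  # use the page URL as the entry id
        ET.SubElement(entry, atom("link"), href=link)
        ET.SubElement(entry, atom("summary")).text = item.findtext("description", default="")
        pub_date = item.findtext("pubDate")
        if pub_date:
            # RSS dates are RFC 822 style; Atom expects an ISO 8601 / RFC 3339 timestamp.
            ET.SubElement(entry, atom("updated")).text = parsedate_to_datetime(pub_date).isoformat()

    return ET.tostring(feed, encoding="unicode")


# Made-up sample input for illustration only.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <title>Example new pages</title>
  <item>
    <title>Example page</title>
    <link>https://example.org/wiki/Example_page</link>
    <description>A placeholder description.</description>
    <pubDate>Mon, 01 Jan 2001 00:00:00 GMT</pubDate>
  </item>
</channel></rss>"""

atom_xml = rss_to_atom(SAMPLE_RSS, "urn:example:new-pages")
print(atom_xml)
```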

Database, Wikispecies

This post was prompted by Stephen Thorpe's post on TAXACOM about Wikispecies in which he wrote (in a thread discussing Roger Hyam's recent blog post) that

I beg to differ. Wikispecies runs on a database (the MediaWiki software uses a database to store the wiki), and MediaWiki can be thought of as a database of semi-structured text, but it lacks a lot of the functionality database users would expect.