Computer and Information Sciences, English, Blogger

iPhylo

Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed. ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.

Tags: "author Names", Clustering, Data Quality, NCBI, Taxonomic Name

As part of my Quixotic attempt to construct a wiki of taxonomic names, I'm building a database of names and links. My current plan is to seed this with the NCBI taxonomy. What I want to do is flesh out the NCBI taxonomy with authorities and links to the original literature. At the moment the NCBI taxonomy is almost "nude", lacking links to the literature behind the names.

Tags: BHL, Library, Museum, NHM, Presentation

Vince Smith has produced a nice flyer for my forthcoming talk at The Natural History Museum on March 17th (11-12). It will be a busy day as I'm also talking at the British Library in the evening (6pm - 8:30pm), for which Sarah Kemmitt has produced a flyer, and set up a discussion forum on Nature Network. With all this effort going into the artwork, I'd better actually come up with something useful to say.

Tags: GUIDs, Semantic Web, Taxonomy, Wiki

Reading a recent TAXACOM thread (Species Pages - purpose) my sense is that some people are arguing that "species pages" would be time consuming to create, aren't much good for taxonomists (to quote Mike Dallwitz "In brief, to make simplified and attractive information about taxa easily available to casual users?"), and nobody gets credit for making them.

Tags: OTU, Phylogeny, Wiki

Another issue I'm trying to get my head around is how to deal with labels in phylogenies. These can be any number of things, such as GenBank sequences, specimen codes, taxon names, abbreviations of taxon names, laboratory codes, etc. Here's my quick attempt to model these: This sketches various levels of indirection to go from a label in a tree to a taxon name.
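The idea of following several levels of indirection from a tree label to a taxon name can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model from the post: the `Link` class, the `resolve` function, and all labels in the usage example (`LAB-7`, `ACC0001`, `SPEC-001`, `Aus bus`) are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Link:
    """One step of indirection (hypothetical structure)."""
    kind: str    # what the source label is, e.g. "sequence" or "specimen"
    target: str  # the label one step closer to a taxon name


def resolve(label: str, links: Dict[str, Link], names: Set[str],
            max_hops: int = 10) -> Optional[str]:
    """Follow links from a tree label until a known taxon name is reached.

    Returns None if the chain breaks or is longer than max_hops
    (the hop limit also guards against accidental cycles).
    """
    current = label
    for _ in range(max_hops):
        if current in names:
            return current
        link = links.get(current)
        if link is None:
            return None
        current = link.target
    return current if current in names else None


# Placeholder data: a lab code points to a sequence, the sequence to a
# specimen, and the specimen to a taxon name.
names = {"Aus bus"}
links = {
    "LAB-7": Link("laboratory code", "ACC0001"),
    "ACC0001": Link("sequence", "SPEC-001"),
    "SPEC-001": Link("specimen", "Aus bus"),
    "A. bus": Link("abbreviation", "Aus bus"),
}

print(resolve("LAB-7", links, names))
print(resolve("A. bus", links, names))
```

Keeping the `kind` on each link means the same lookup works whatever the label turns out to be, while still recording why a given label maps to a name.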

Tags: Host, Modelling, Parasite, Taxonomic Concept, Wiki

I rather skirted around the notion of "taxonomic concepts" in the previous post, partly because it's easy to end up trying to have a concept for each utterance ever made by a taxonomist, and that doesn't seem, er, scalable. So, I have a more limited view of a taxonomic concept, namely a name attached to some data.

Tags: Classification, Design, Modelling, Taxonomy, Wiki

Modelling taxa is a bit trickier. I've sketched my ideas for distinguishing name strings and taxonomic names earlier. That's the easy stuff. What about "taxonomic concepts" and "OTUs"? As a first pass, I'm looking at linking taxon names to classifications via GUIDs.

Tags: Design, Modelling, Wiki

Time to make some notes. I've been playing with using Semantic Mediawiki to create a database of taxonomic names, literature, specimens, sequences, and phylogenies. One challenge is to come up with simple ways to model these entities, in a way that makes both data entry and querying as simple as possible. Some things are straightforward. For example, a publication can be modelled like this: OK, I've ignored the attributes.

Tags: BBC, Creative Commons, Phylogeny, Tree Of Life, Visualisation

Last night BBC One aired David Attenborough's Charles Darwin and the Tree of Life, which featured a lovely "fly through" the tree of life: In conjunction with the TV show, the Wellcome Trust has launched the Interactive Tree of Life, a Flash-based view of the tree of life. There's also a blog about the project. Here's a demo of the tree: The tree looks very nice, and a lot of work has gone into it, but I am somewhat underwhelmed.

Tags: Darwin, EvolDir, RSS, Twitter

Well, not Darwin himself, exactly. The Evolution Directory (better known as "EvolDir") is a mailing list run by Brian Golding at McMaster University, Ontario. It's widely used by evolutionary biologists to post announcements about jobs, courses, conferences, software, and other topics of interest to the community.

Tags: "author Names", "web Service", Bibliometrics, Bioguid

One problem I've encountered in building a bibliographic database is the different ways author names are written. For example, for papers I've authored my name may be written as "Roderic D. M. Page" or "R. D. M. Page". Googling this problem, I came across Dror Feitelson's paper "On identifying name equivalences in digital libraries".
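To make the problem concrete, here is a minimal Python sketch of one way to test whether two author strings could refer to the same person. It is not Feitelson's algorithm; it is a simple initials-based compatibility check, and the function names (`parse_name`, `given_compatible`, `names_compatible`) are hypothetical. It assumes names are written given-names-first with the family name as the last token.

```python
def parse_name(name):
    """Split 'Roderic D. M. Page' into (['roderic', 'd', 'm'], 'page').

    Assumes a non-empty name with the family name as the last token.
    """
    # Put a space after each full stop so 'R.D.M. Page' tokenises correctly.
    tokens = [t.strip(".").lower() for t in name.replace(".", ". ").split()]
    tokens = [t for t in tokens if t]
    return tokens[:-1], tokens[-1]


def given_compatible(x, y):
    """An initial matches any given name starting with the same letter."""
    if len(x) == 1 or len(y) == 1:
        return x[0] == y[0]
    return x == y


def names_compatible(a, b):
    """True if the two strings could be variants of the same author name."""
    given_a, family_a = parse_name(a)
    given_b, family_b = parse_name(b)
    if family_a != family_b:
        return False
    # Compare given names position by position; zip stops at the shorter
    # list, so omitted middle names do not count against a match.
    return all(given_compatible(x, y) for x, y in zip(given_a, given_b))


print(names_compatible("Roderic D. M. Page", "R. D. M. Page"))
print(names_compatible("Roderic D. M. Page", "Robert Page"))
```

A check like this only says that two strings are compatible, not that they are the same person; two different authors sharing a family name and initials would still be merged, which is why clustering on further evidence such as co-authors is usually needed.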

Tags: Mediawiki, Scratchpads

Yes, I know this is ultimately a case of the "genius of and", but the more I play with the Semantic Mediawiki extension the more I think this is going to be the most productive way forward. I've had numerous conversations with Vince Smith about this. Vince and colleagues at the NHM have been doing a lot of work on "Scratchpads" -- Drupal-based web sites that tend to be taxon-focussed.