Computer and Information Sciences, English, Blogger

iPhylo

Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed. ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.
Bibliography Of Life, CSL, ElasticSearch, JSON, JSON-LD

I've released a simple search engine for publications in Wikidata. Wikicite Search takes its name from the WikiCite project, which was an initiative to create a bibliographic database in Wikidata. Since bibliographic data is a core component of taxonomic research (arguably taxonomy is mostly tracing the fate of the "tags" we call taxonomic names) I've spent some time getting taxonomic literature into Wikidata.

Citation, CSL, Machine Learning, Parsing

Quick note on a tool I've been working on to parse citations, that is, to take a series of strings such as:

Möllendorff O (1894) On a collection of land-shells from the Samui Islands, Gulf of Siam. Proceedings of the Zoological Society of London, 1894: 146–156.

de Morgan J (1885) Mollusques terrestres & fluviatiles du royaume de Pérak et des pays voisins (Presqúile Malaise). Bulletin de la Société Zoologique de France, 10: 353–249.
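To make the task concrete, here is a minimal sketch of turning strings like these into structured records. It is not the tool described in the post (the tags suggest a machine-learning approach); it is a plain regular expression that assumes every citation follows the single "Authors (Year) Title. Journal, Volume: Pages." layout seen above. The function name `parse_citation` and the pattern are hypothetical, and the output keys simply borrow CSL-JSON field names.

```python
import re

# Hypothetical pattern for ONE layout only:
#   Authors (Year) Title. Journal, Volume: Pages.
# Real citation lists are far messier, which is why a regex alone rarely suffices.
CITATION_RE = re.compile(
    r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\s+"
    r"(?P<title>.+?)\.\s+"
    r"(?P<journal>[^,]+),\s+"
    r"(?P<volume>\d+):\s+"
    r"(?P<pages>\d+\s*[–-]\s*\d+)\.?$"
)


def parse_citation(text):
    """Return a CSL-JSON-like dict for a citation string, or None if it does not match."""
    match = CITATION_RE.match(text.strip())
    if match is None:
        return None
    return {
        "type": "article-journal",
        # Author names are kept as one literal string rather than split into people.
        "author": [{"literal": match.group("authors")}],
        "issued": {"date-parts": [[int(match.group("year"))]]},
        "title": match.group("title"),
        "container-title": match.group("journal"),
        "volume": match.group("volume"),
        # Normalise the en dash and any surrounding spaces to a plain hyphen.
        "page": re.sub(r"\s*[–-]\s*", "-", match.group("pages")),
    }


if __name__ == "__main__":
    example = (
        "Möllendorff O (1894) On a collection of land-shells from the Samui "
        "Islands, Gulf of Siam. Proceedings of the Zoological Society of "
        "London, 1894: 146–156."
    )
    print(parse_citation(example))
```

A pattern like this fails as soon as the layout changes (a missing year, a period inside the title, a book rather than an article), which is the motivation for a learned parser.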

C++, Cloud, Compiling, Heroku

TL;DR: Use a buildpack and set "LDFLAGS=--static" --disable-shared

I use Heroku to host most of my websites, and since I mostly use PHP for web development this has worked fine. However, every so often I write an app that calls an external program written in, say, C++. Up until now I've had to host these apps on my own web servers. Today I finally bit the bullet and learned how to add a C++ program to a Heroku-hosted site.

ALA, BHL, BioStor, GBIF, Plazi

If you compare the impact that BHL and Plazi have on GBIF, then it's clear that BHL is almost invisible. Plazi has successfully carved out a niche where they generate tens of thousands of datasets from text mining the taxonomic literature, whereas BHL is a participant in name only. It's not as if BHL lacks geographic data.

Citation, CRF, Identifiers, Machine Learning, Specimens

Note to self. The challenge of finding specimen citations in papers keeps coming around. It seems that this is basically the same problem as finding citations to papers, and can be approached in much the same way. If you want to build a database of references from scratch, one way is to scrape citations from papers (e.g., from the "literature cited" section), convert those strings into structured data, and add those to your database.
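The scrape-and-store part of that workflow can be sketched as follows. This is only an illustration under strong assumptions: the paper is already plain text, the reference list starts at a line reading "Literature cited" or "References", and each citation sits on its own line. The function names and the single-table SQLite schema are hypothetical, and the step that converts each string into structured data is left out.

```python
import re
import sqlite3


def extract_citation_strings(text):
    """Return the non-empty lines that follow a 'Literature cited' or 'References' heading."""
    heading = re.search(
        r"^(literature cited|references)\s*$",
        text,
        flags=re.IGNORECASE | re.MULTILINE,
    )
    if heading is None:
        return []
    tail = text[heading.end():]
    # Assumption: one citation per line in the extracted text.
    return [line.strip() for line in tail.splitlines() if line.strip()]


def store_citations(conn, citations):
    """Add raw citation strings to a table, skipping strings already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS citation (id INTEGER PRIMARY KEY, raw TEXT UNIQUE)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO citation (raw) VALUES (?)",
        [(c,) for c in citations],
    )
    conn.commit()


if __name__ == "__main__":
    paper = (
        "Some body text.\n\n"
        "Literature cited\n"
        "Author A (1900) First title. Journal One, 1: 1-2.\n"
        "Author B (1901) Second title. Journal Two, 2: 3-4.\n"
    )
    conn = sqlite3.connect(":memory:")
    store_citations(conn, extract_citation_strings(paper))
    for row in conn.execute("SELECT id, raw FROM citation"):
        print(row)
```

Storing the raw string with a uniqueness constraint means the same citation scraped from several papers is kept once, and a parsing step can be run over the table later.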

Catalogue Of Life, Graphviz, Summary Trees, Visualisation

How to cite: Page, R. (2021). Maximum entropy summary trees to display higher classifications. https://doi.org/10.59350/af01t-6sw74

A challenge in working with large taxonomic classifications is how you display them to the user, especially if the user probably doesn't want all the gory details.

Bibliography Of Life, Preprint, WikiCite, Wikidata

Last week I submitted a manuscript entitled "Wikidata and the bibliography of life". I've been thinking about the "bibliography of life" (AKA a database of every taxonomic publication ever published) for a while, and this paper explores the idea that Wikidata is the place to create this database.

BHL, BioStor, Visualisation

It's funny how some images stick in the mind. A few years ago Chris Freeland (@chrisfreeland), then working for Biodiversity Heritage Library (BHL), created a visualisation of BHL content relevant to the African continent. It's a nice example of small multiples. For more than a decade (gulp) I've been extracting articles from the BHL and storing them in BioStor.

Challenge, DNA Barcoding, GBIF

Somewhat stunned that the DNA barcode browser I described earlier was one of the (minor) prizewinners in this year's GBIF Ebbe Nielsen Challenge. For details on the winner and other place getters, see "ShinyBIOMOD wins 2020 GBIF Ebbe Nielsen Challenge". Obviously I'm biased, but it's nice to see the challenge inspiring creativity in biodiversity informatics. Congratulations to everyone who took part.