Computer and information sciences · English · Blogger

iPhylo

Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed. ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.

For a while now I’ve been exploring ways to navigate through DNA barcodes. Over the years I’ve built various “toys” to explore barcodes: I wrote “Displaying a million DNA barcodes on Google Maps using CouchDB”, built a small-scale browser using Elasticsearch that had some success, and discovered that Postgres can search for DNA sequences and that it’s really fast.

How to cite: Page, R. (2024). Internet Archive as a single point of failure. https://doi.org/10.59350/1r3m1-c5e22

Just a placeholder to mark the ongoing impact of the Internet Archive being attacked (see here, here and here for details). The impact of this on the Biodiversity Heritage Library (BHL) has been huge, and reveals the extent to which BHL depends on the Archive.

How to cite: Page, R. (2024). Exploring BOLD's DNA barcode data releases: there's a fraction too much friction. https://doi.org/10.59350/6qepn-ge510

Recently I’ve been exploring data downloaded from BOLD. Part of this was motivated by work done with David Schindel for a recent book. In this blog post I record some struggles I’ve had with the supposedly “Frictionless” data provided by BOLD.

How to cite: Page, R. (2024). The Data Citation Corpus revisited. https://doi.org/10.59350/wvwva-v7125

TL;DR: These are some brief notes on the latest version (v. 2) of the Data Citation Corpus, released shortly before the Make Data Count Summit 2024, which also included a discussion on the practical uses of the corpus. I downloaded version 2 from Zenodo doi:10.5281/zenodo.13376773.

How to cite: Page, R. (2024). Why do museum and gallery displays ignore the web? https://doi.org/10.59350/a83tn-c6t14

This post is inspired by the Pharaoh exhibition at the NGV in Melbourne, Australia. This is a beautifully displayed exhibition of objects from the British Museum, London. It has all the trappings of a modern exhibition: beautiful lighting, a custom sound track, and lots of social media coverage.

How to cite: Page, R. (2024). A future for the Biodiversity Heritage Library. https://doi.org/10.59350/n3dkt-6xd05

Following the 2024 BHL meeting and the departure of Martin Kalfatovic, with the uncertainty that losing such a pivotal person brings, perhaps it’s time to think about the future of BHL. Below I sketch some thoughts, which are hazy at best. I should say at the outset that I think BHL is an extraordinary project.

How to cite: Page, R. (2024). Visualising big trees: a talk at the Systematics Association 2024. https://doi.org/10.59350/cf6n4-ch767

This blog post has some notes in support of a talk given to the Systematics Association meeting in Reading on June 20th, 2024.

Slides: I will post a link to the slides here once I have given the talk. Page, Roderic (2024). Visualising big trees. figshare. Presentation.

FAIR, Identifiers, Nanopublication, Pensoft, RDF

How to cite: Page, R. (2024). Nanopubs, a way to create even more silos. https://doi.org/10.59350/6nj85-7te92

Pensoft have recently introduced “nanopubs”, small structured publications that can be thought of as containing the minimum possible statement that could be published. Nanopubs are promoted as FAIR, that is, findable, accessible, interoperable, and reusable.

How to cite: Page, R. (2024). Notes on transforming BHL images. https://doi.org/10.59350/2gpbb-98a53

I’ve been down this road before, e.g. “BHL, DjVu, and reading the f*cking manual” and “Demo of full-text indexing of BHL using CouchDB hosted by Cloudant”, but I’m revisiting converting BHL page scans to black and white images, partly to clean them up, to make them closer to what a modern reader might expect, and partly to reduce the

How to cite: Page, R. (2024). Hugging Face Autotrain. https://doi.org/10.59350/7p1n4-wdv84

These are notes to myself on using Hugging Face AutoTrain. The first version of this had a very nice interface where you could simply upload a folder of images and train a model. It was limited in the range of tasks and models, but made up for that in ease of use.

How to cite: Page, R. (2024). Problems with the DataCite Data Citation Corpus. https://doi.org/10.59350/t80g1-xys37

DataCite have released the Data Citation Corpus, together with a dashboard that summarises the corpus. This is billed as: “The goal is to build a citation database between scholarly articles and data, such as datasets in repositories, sequences in GenBank, protein structures in PDB, etc.”