Rogue Scholar

JSON-LD, RDF, Computer and Information Sciences, English

JSON-LD in the wild: examples of how structured data is represented on the web

Published 27 August 2021

I've created a GitHub repository so that I can keep track of the examples of JSON-LD that I've seen being actively used, for example embedded in web sites, or accessed using an API. The repository is https://github.com/rdmpage/wild-json-ld. The list is by no means exhaustive; I hope to add more examples as I come across them. One reason for doing this is to learn what others are doing.
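To illustrate the "embedded in web sites" case, here is a minimal Python sketch that pulls JSON-LD out of a page's `<script type="application/ld+json">` elements using only the standard library. The names `JsonLdExtractor` and `extract_json_ld` and the sample page are hypothetical and are not taken from the repository.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.documents.append(json.loads("".join(self._buffer)))


def extract_json_ld(html):
    """Return a list of parsed JSON-LD documents found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Hypothetical page, invented for this example.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "ScholarlyArticle", "name": "An example article"}
</script>
<script>var notJsonLd = true;</script>
</head><body></body></html>
"""

if __name__ == "__main__":
    for doc in extract_json_ld(SAMPLE_HTML):
        print(doc["@type"], "-", doc["name"])
```

This only covers JSON-LD embedded in HTML. JSON-LD served by an API would instead be fetched directly and passed to `json.loads`.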

Computer and Information Sciences, English

Species Cite: linking scientific names to publications and taxonomists

https://doi.org/10.59350/jarsz-yfm45

Published 23 July 2021

Author Roderic Page

I've made Species Cite live. This is a web site I've been working on with the GBIF Challenge as a notional deadline so I'll actually get something out the door. "Species Cite" takes as its inspiration the suggestion that citing original taxonomic descriptions (and subsequent revisions) would increase citation metrics for taxonomists, and give them the credit they deserve.

Bibliography Of Life, CSL, ElasticSearch, JSON, JSON-LD, Computer and Information Sciences, English

Towards a WikiCite search engine

https://doi.org/10.59350/45mzm-67867

Published 22 July 2021

Author Roderic Page

I've released a simple search engine for publications in Wikidata. Wikicite Search takes its name from the WikiCite project, which was an initiative to create a bibliographic database in Wikidata. Since bibliographic data is a core component of taxonomic research (arguably taxonomy is mostly tracing the fate of the "tags" we call taxonomic names) I've spent some time getting taxonomic literature into Wikidata.

Citation, CSL, Machine Learning, Parsing, Computer and Information Sciences, English

Citation parsing tool released

https://doi.org/10.59350/9416m-mzz03

Published 22 July 2021

Author Roderic Page

Quick note on a tool I've been working on to parse citations, that is, to take a series of strings such as:

Möllendorff O (1894) On a collection of land-shells from the Samui Islands, Gulf of Siam. Proceedings of the Zoological Society of London, 1894: 146–156.

de Morgan J (1885) Mollusques terrestres & fluviatiles du royaume de Pérak et des pays voisins (Presqúile Malaise). Bulletin de la Société Zoologique de France, 10: 353–249.
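To make the goal concrete, here is a rough Python sketch of what turning such strings into structured data can look like. It uses a single regular expression, which is a stand-in chosen for brevity and not the method used by the tool described in the post. The function name `parse_citation` is hypothetical, and the CSL-JSON-style field names are an assumption.

```python
import re

# Assumed layout: Authors (Year) Title. Journal, Volume: First–Last.
# A hand-written pattern like this only fits citations in this one style.
CITATION_PATTERN = re.compile(
    r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\s+"
    r"(?P<title>.+?)\.\s+"
    r"(?P<journal>[^,]+),\s+"
    r"(?P<volume>[^:]+):\s+"
    r"(?P<first>\d+)[–-](?P<last>\d+)\.?$"
)


def parse_citation(text):
    """Return a CSL-JSON-style dict for one citation, or None if no match."""
    match = CITATION_PATTERN.match(text.strip())
    if match is None:
        return None
    return {
        "author": [{"literal": match["authors"]}],
        "issued": {"date-parts": [[int(match["year"])]]},
        "title": match["title"],
        "container-title": match["journal"],
        "volume": match["volume"].strip(),
        "page": f"{match['first']}-{match['last']}",
    }


# The two example strings quoted in the post.
CITATIONS = [
    "Möllendorff O (1894) On a collection of land-shells from the Samui "
    "Islands, Gulf of Siam. Proceedings of the Zoological Society of "
    "London, 1894: 146–156.",
    "de Morgan J (1885) Mollusques terrestres & fluviatiles du royaume de "
    "Pérak et des pays voisins (Presqúile Malaise). Bulletin de la Société "
    "Zoologique de France, 10: 353–249.",
]

if __name__ == "__main__":
    for citation in CITATIONS:
        print(parse_citation(citation))
```

A pattern like this breaks as soon as the citation style changes, which is one reason trained sequence-labelling models are a common choice for this task.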

C++, Cloud, Compiling, Heroku, Computer and Information Sciences, English

Compiling a C++ application to run on Heroku

https://doi.org/10.59350/vy6b8-0eh95

Published 15 June 2021

Author Roderic Page

TL;DR Use a buildpack and set "LDFLAGS=--static" --disable-shared. I use Heroku to host most of my websites, and since I mostly use PHP for web development this has worked fine. However, every so often I write an app that calls an external program written in, say, C++. Up until now I've had to host these apps on my own web servers. Today I finally bit the bullet and learned how to add a C++ program to a Heroku-hosted site.

ALA, BHL, BioStor, GBIF, Plazi, Computer and Information Sciences, English

Thoughts on BHL, ALA, GBIF, and Plazi

https://doi.org/10.59350/17w25-9m342

Published 4 June 2021

Author Roderic Page

If you compare the impact that BHL and Plazi have on GBIF, then it's clear that BHL is almost invisible. Plazi has successfully carved out a niche where they generate tens of thousands of datasets from text mining the taxonomic literature, whereas BHL is a participant in name only. It's not as if BHL lacks geographic data.

Citation, CRF, Identifiers, Machine Learning, Specimens, Computer and Information Sciences, English

Finding citations of specimens

https://doi.org/10.59350/gg8m4-vb985

Published 28 May 2021

Author Roderic Page

Note to self. The challenge of finding specimen citations in papers keeps coming around. It seems that this is basically the same problem as finding citations to papers, and can be approached in much the same way. If you want to build a database of references from scratch, one way is to scrape citations from papers (e.g., from the "literature cited" section), convert those strings into structured data, and add those to your database.

Catalogue Of Life, Graphviz, Summary Trees, Visualisation, Computer and Information Sciences, English

Maximum entropy summary trees to display higher classifications

https://doi.org/10.59350/af01t-6sw74

Published 28 May 2021

Author Roderic Page

How to cite: Page, R. (2021). Maximum entropy summary trees to display higher classifications. https://doi.org/10.59350/af01t-6sw74

A challenge in working with large taxonomic classifications is how you display them to the user, especially if the user probably doesn't want all the gory details.

Bibliography Of Life, Preprint, WikiCite, Wikidata, Computer and Information Sciences, English

Preprint on Wikidata and the bibliography of life

https://doi.org/10.59350/e8exj-s0j78

Published 18 May 2021

Author Roderic Page

Last week I submitted a manuscript entitled "Wikidata and the bibliography of life". I've been thinking about the "bibliography of life" (AKA a database of every taxonomic publication ever published) for a while, and this paper explores the idea that Wikidata is the place to create this database.

BHL, DNA Barcoding, GraphQL, Hendy, LSID, Computer and Information Sciences, English

It's been a while...

https://doi.org/10.59350/m4my7-7g754

Published 6 April 2021

Author Roderic Page

It's been a while since I've blogged here. The last few months have been, um, interesting for so many reasons.

BHL, BioStor, Visualisation, Computer and Information Sciences, English

Visualising article coverage in the Biodiversity Heritage Library

https://doi.org/10.59350/64rd0-h6t88

Published 24 October 2020

Author Roderic Page

It's funny how some images stick in the mind. A few years ago Chris Freeland (@chrisfreeland), then working for Biodiversity Heritage Library (BHL), created a visualisation of BHL content relevant to the African continent. It's a nice example of small multiples. For more than a decade (gulp) I've been extracting articles from the BHL and storing them in BioStor.

iPhylo
