The International Society of Biocuration (ISB) partners with the journal Database to get discounts for its members when they publish there. This means the ISB’s executive committee needs to send a member list to the journal’s editor.
The Open Researcher and Contributor Identifier (ORCID) database is an invaluable resource that supports the unambiguous identification of researchers. However, its first-party data dump is too complex, verbose, and unstandardized for many use cases. This post describes open source software I wrote that automates downloading, processing, and exporting ORCID data into a more usable form. I put the results on Zenodo under the CC0 license.

I’ve just returned from the 17th Annual International Biocuration Conference at the Indian Biological Data Centre (IBDC) in Faridabad, India.

Using Pydantic for encoding data models and FastAPI for implementing APIs on top of them has become a staple for many Python programmers.
I finally got back into reading! Over winter break 2022, I started the Stormlight Archive, then followed up in 2023 by reading the entirety of Brandon Sanderson’s Cosmere, as well as some other fantasy, science fiction, and literary fiction. Here’s the list.

The Unified Medical Language System (UMLS) is a widely used biomedical and clinical vocabulary maintained by the United States National Library of Medicine. However, it is notoriously difficult to access and work with due to licensing restrictions and its complex download system. In the same vein as my previous posts about DrugBank and ChEMBL, this post describes open source software I’ve developed for downloading and working with this data.
I’ve been working on improving reproducibility in the field of cheminformatics for some time now. For example, I’ve written posts about making data from DrugBank and ChEMBL more actionable. Over the last year, I’ve been preparing a concept with the editors of the Journal of Cheminformatics on how to include an assessment of reproducibility in reviews of manuscripts submitted to the journal.
Today’s short post is about three SPARQL queries I wrote to get bibliometric information about journals and publishers out of Wikidata. Each of the following queries can be readily copy-pasted into the Wikidata Query Service and run in the browser.

I was recently nominated for the International Society for Biocuration’s Excellence in Biocuration Early Career Award (results will be announced on June 14th!). This made me curious about how to model nominations and awards on Wikidata. In this post, I’ll describe how to curate awards, nominations, and recipients, and how to write SPARQL queries to retrieve them.
Archival Resource Keys (ARKs) are a flavor of persistent identifier, like DOIs, URNs, and Handles, that have the benefit of being free, flexible about what metadata gets attached, and natively able to resolve to web pages. Name-to-Thing (N2T) implements a resolver for a variety of ARKs, so this blog post is about how that resolver can be re-implemented with the curies Python package. In a lot of ways, ARKs look and act like CURIEs.

This blog is normally about very serious science, but I’m taking a break from that for the evening to advertise my band’s upcoming show on April 8th in the SPH Music Masters Finale (aka the German Battle of the Bands). We need your support!