
Biopragmatics

Unraveling complex biology with biological knowledge graphs. Content licensed under CC BY 4.0.
Named Entity Recognition, Text Mining, Natural Language Processing, Named Entity Normalization, Medical Subject Headings, English
Author: Charles Tapley Hoyt

Annotating the literature with mentions of key concepts from a given domain is often the first step towards extracting more substantial structured knowledge. This can be challenging, as it typically involves acquiring and processing the relevant literature and ontologies, then installing and applying difficult-to-use named entity recognition (NER) workflows. This post highlights software components I’ve implemented to simplify this workflow.
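
To make the idea of annotating text with concept mentions concrete, here is a minimal dictionary-lookup sketch. It is not the software described in the post: the lexicon, the `EX:` identifiers, and the `annotate` helper are all made up for illustration, and real NER workflows handle tokenization, overlaps, and ambiguity far more carefully.

```python
import re

# Hypothetical lexicon mapping synonyms to made-up identifiers (assumption).
LEXICON = {
    "breast cancer": "EX:0001",
    "aspirin": "EX:0002",
}


def annotate(text, lexicon=LEXICON):
    """Return (start, end, matched text, identifier) tuples sorted by offset."""
    results = []
    for synonym, identifier in lexicon.items():
        # Match the synonym as a whole word or phrase, ignoring case.
        pattern = r"\b" + re.escape(synonym) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            results.append((match.start(), match.end(), match.group(0), identifier))
    return sorted(results)


if __name__ == "__main__":
    for annotation in annotate("Aspirin use in breast cancer."):
        print(annotation)
```

Each result carries character offsets, so the matched span can be highlighted in the original text or stored alongside the identifier it was normalized to.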

LinkML, Bioregistry, Prefix Maps, CURIEs, URIs, English
Author: Charles Tapley Hoyt

I recently attended the 4th BioHackathon Germany hosted by the German Network for Bioinformatics Infrastructure (de.NBI). I participated in the project "On the Path to Machine-actionable Training Materials" in order to improve the interoperability between DALIA, TeSS, mTeSS-X, and Schema.org. This post summarizes the activities leading up to the hackathon and the results of our happy hacking.

BioPortal, OntoPortal, SSSOM, Natural Sciences, English
Author: Charles Tapley Hoyt

Earlier this week, someone asked on the OBO Foundry Slack where to find semantic mappings to terms in the Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT). While some are available in the SeMRA Disease Mappings Database, many more are available within BioPortal, which has access to the entire SNOMED-CT source data and has produced semantic mapping predictions using LOOM.

Ontology, OWL, Genes, Natural Sciences, English
Author: Charles Tapley Hoyt

This is the first of a two-part post about encoding databases as ontologies. In this post, I give background on the problems in biocuration that led me to start encoding databases as ontologies, the software I have written to do it, and the repository I have created to store the resulting artifacts in a FAIR, open, and sustainable way.

Ontology, OWL, Genes, HGNC, Natural Sciences, English
Author: Charles Tapley Hoyt

This is the second of a two-part post about encoding databases as ontologies. In the first part, I gave background on how I started working on this problem and the software stack I developed along the way. In this post, I explain the philosophy and design behind how I encoded the HGNC (HUGO Gene Nomenclature Committee) database as an ontology using PyOBO.

Knowledge Graphs, SPARQL, Chemistry, Culture, Cultural Heritage, Natural Sciences, English
Author: Charles Tapley Hoyt

At the sixth NFDI4Chem consortium meeting, Torsten Schrade from the NFDI4Culture consortium gave a lovely and whimsical talk entitled "A Data Alchemist’s Journey through NFDI", which explored ways that we might federate and jointly query both consortia’s knowledge via their respective SPARQL endpoints. He proposed a toy example in which he linked paintings depicting alchemists trying to make gold to compounds containing gold.

ROR, Wikidata, Organization, Organizations, Bibliometrics, Natural Sciences, English
Author: Charles Tapley Hoyt

I was looking at the different NFDI consortia in the Research Organization Registry (ROR) and found that the only two that have a parent relation to the NFDI (ror:05qj6w324) are NFDI4DS (ror:00bb4nn95) and MaRDI (ror:04ncnzm65). This felt strange to me, so I started looking around Wikidata to see if I could automatically make a curation sheet to send along to them.

Packaging, Python, Tox, Just, Cookiecutter-snekpack, Natural Sciences, English
Author: Charles Tapley Hoyt

I became aware of just while watching Hynek’s second video on uv a few months ago. I immediately fell in love with its elegance and simplicity, so I have begun replacing task running in my repositories that relied on tox with just. This post gives a bit of background and context, then walks through making the switch on one of my repositories that has some annoying dependencies.

NFDI, SPARQL, Bioregistry, Natural Sciences, English
Author: Charles Tapley Hoyt

Earlier this week at the sixth NFDI4Chem consortium meeting, Torsten Schrade from the NFDI4Culture consortium gave a lovely and whimsical talk entitled "A Data Alchemist’s Journey through NFDI", which explored ways that we might federate and jointly query both consortia’s knowledge via their respective SPARQL endpoints.

CURIE, URI, URN, IRI, Identifiers, Natural Sciences, English
Author: Charles Tapley Hoyt

Using standard CURIE prefixes and URI prefixes in semantic web artifacts such as Resource Description Framework (RDF) promotes interoperability, enables reuse in downstream data integration, and makes data more FAIR. The Bioregistry defines a set of standard CURIE prefixes and URI prefixes against which RDF files can be validated/standardized.
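
As a rough illustration of what standardizing against a prefix map involves, the sketch below compresses a URI into a CURIE and expands it back. It assumes a tiny hand-written prefix map rather than loading the Bioregistry's own, and `compress` and `expand` are hypothetical helper names, not the Bioregistry API.

```python
# Small hand-written prefix map for illustration only (assumption);
# the Bioregistry's actual prefix map is far larger and is not loaded here.
PREFIX_MAP = {
    "chebi": "http://purl.obolibrary.org/obo/CHEBI_",
    "go": "http://purl.obolibrary.org/obo/GO_",
}


def compress(uri, prefix_map=PREFIX_MAP):
    """Return a CURIE for the URI using the longest matching URI prefix, else None."""
    best = None
    for prefix, uri_prefix in prefix_map.items():
        if uri.startswith(uri_prefix):
            if best is None or len(uri_prefix) > len(best[1]):
                best = (prefix, uri_prefix)
    if best is None:
        return None
    prefix, uri_prefix = best
    return f"{prefix}:{uri[len(uri_prefix):]}"


def expand(curie, prefix_map=PREFIX_MAP):
    """Return the URI for a CURIE, or None if its prefix is not in the map."""
    prefix, sep, identifier = curie.partition(":")
    if not sep or prefix not in prefix_map:
        return None
    return prefix_map[prefix] + identifier


if __name__ == "__main__":
    print(compress("http://purl.obolibrary.org/obo/CHEBI_1234"))
    print(expand("go:0008150"))
```

Returning `None` for unknown prefixes is one possible design choice; a validator could instead collect such cases and report them as non-standard.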

ChEMBL, Cheminformatics, Chemoinformatics, Chemistry, Bibliometrics, Natural Sciences, English
Author: Charles Tapley Hoyt

I’ve recently submitted an article to the Journal of Open Source Software (JOSS) describing chembl-downloader, a Python package for automating downloading and using ChEMBL data in a reproducible way. In this post, I use chembl-downloader to show how the numbers of compounds, assays, activities, and other entities in ChEMBL have changed over time.