
Biopragmatics

Unraveling complex biology with biological knowledge graphs. Content licensed under CC BY 4.0.
Bibliometrics, Citations, Citation Networks
Published
Author Charles Tapley Hoyt

OpenCitations aggregates and deduplicates bibliographic information from CrossRef, Europe PubMed Central, and other sources to construct a comprehensive, open index of citations between scientific works. This post describes the opencitations-client package which wraps the OpenCitations API and implements an automated pipeline for locally downloading, caching, and accessing OpenCitations in bulk.

SSSOM, Semantic Mappings, Knowledge Graphs
Published
Author Charles Tapley Hoyt

There are many challenges associated with the curation, publication, acquisition, and usage of semantic mappings. This post examines their philosophical, technical, and practical implications, highlights existing solutions, and describes opportunities for next steps for the community of curators, semantic engineers, software developers, and data scientists who make and use semantic mappings.

SSSOM, Semantic Mappings, Knowledge Graphs
Published
Author Charles Tapley Hoyt

Data and knowledge originating from heterogeneous sources often use different controlled vocabularies and/or ontologies for annotating named entities. Semantic mappings are essential for resolving these discrepancies and integrating the data coherently.

SSSOM, SKOS, Semantic Mappings, Mappings, Interoperability
Published
Author Charles Tapley Hoyt

JSKOS (JSON for Knowledge Organization Systems) is a JSON-based data model for representing terminologies, thesauri, classifications, and other semantic artifacts. Like the Simple Standard for Sharing Ontological Mappings (SSSOM), it can also encode semantic mappings. This post is about developing and implementing a crosswalk between them in the sssom-pydantic Python package.

SSSOM, Wikidata, SKOS, Semantic Mappings, Mappings
Published
Author Charles Tapley Hoyt

At the 4th Ontologies4Chem Workshop in Limburg an der Lahn, I proposed an initial crosswalk between the Simple Standard for Sharing Ontological Mappings (SSSOM) and the Wikidata semantic mapping data model. This post describes the motivation for this proposal and the concrete implementation I’ve developed in sssom-pydantic.

LinkML, Bioregistry, Prefix Maps, CURIEs, URIs
Published
Author Charles Tapley Hoyt

LinkML enables defining data models and data schemas in YAML informed by semantic web best practices. As such, each definition includes a prefix map. Similar to my previous posts on validating the prefix maps appearing in Turtle files and in unfamiliar SPARQL endpoints, this post describes a new extension to the Bioregistry that validates prefix maps in LinkML definitions.

Named Entity Recognition, Text Mining, Natural Language Processing, Named Entity Normalization, Medical Subject Headings
Published
Author Charles Tapley Hoyt

Annotating the literature with mentions of key concepts from a given domain is often the first step towards extracting more substantial structured knowledge. This can be challenging, as it typically involves acquiring and processing the relevant literature and ontologies, then installing and applying difficult-to-use named entity recognition (NER) workflows. This post highlights software components I’ve implemented to simplify this workflow.

LinkML, Bioregistry, Prefix Maps, CURIEs, URIs
Published
Author Charles Tapley Hoyt

I recently attended the 4th BioHackathon Germany hosted by the German Network for Bioinformatics Infrastructure (de.NBI). I participated in the project "On the Path to Machine-actionable Training Materials" to improve the interoperability between DALIA, TeSS, mTeSS-X, and Schema.org. This post summarizes the activities leading up to the hackathon and the results of our happy hacking.

BioPortal, OntoPortal, SSSOM, Natural Sciences
Published
Author Charles Tapley Hoyt

Earlier this week, someone asked on the OBO Foundry Slack where to find semantic mappings to terms in the Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT). While some are available in the SeMRA Disease Mappings Database, many more are available within BioPortal, which has access to the entire SNOMED-CT source data and has produced semantic mapping predictions using LOOM.

Ontology, OWL, Genes, HGNC, Natural Sciences
Published
Author Charles Tapley Hoyt

This is the second of a two-part post about encoding databases as ontologies. In the first part, I gave background on how I started working on this problem and the software stack I developed along the way. In this post, I explain the philosophy and design behind how I encoded the HGNC (HUGO Gene Nomenclature Committee) database as an ontology using PyOBO.