SWAT4LS was once again a great meeting.
Searching RDF stores is commonly done with SPARQL queries. I have been using them with Andra's semantic web translation of WikiPathways to find common content issues, sometimes combined with some additional Java code. One example is finding PubMed identifiers that are not numbers.
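The kind of check meant here can be sketched as follows. This is only a minimal illustration, assuming the identifiers have already been fetched from the SPARQL endpoint as plain strings. The function name and the sample values are made up for this example, and Python stands in for the Java code mentioned above.

```python
import re

# A PubMed identifier is expected to be a plain run of digits.
DIGITS_ONLY = re.compile(r"[0-9]+")


def non_numeric_pubmed_ids(identifiers):
    """Return the identifiers that are not made up of digits only."""
    return [pmid for pmid in identifiers if not DIGITS_ONLY.fullmatch(pmid)]


# Made-up sample values, not real WikiPathways content.
sample = ["12345678", "PMID:234", " 998877", "", "4455"]

if __name__ == "__main__":
    for bad in non_numeric_pubmed_ids(sample):
        print(repr(bad))
```

Using `fullmatch` rather than `match` means that stray whitespace, prefixes like `PMID:`, and empty strings are all reported, which are the content issues one would want to fix upstream.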
Andra Waagmeester published a paper on his work on a semantic web version of WikiPathways (doi:10.1371/journal.pcbi.1004989). The paper outlines the design decisions, shows the SPARQL endpoint, and gives several example SPARQL queries. These include federated queries, like a mashup with DisGeNET (doi:10.1093/database/bav028) and EMBL-EBI’s Expression Atlas.
In 2010 Samuel Lampa and I started a pet project, collecting pKa data: he was working on an RDF extension of MediaWiki and I like consuming RDF data. We started DrugMet. By the time you read this post, this MediaWiki installation may already be down, which is why I am migrating the data to Wikidata. Why?
Last week the huge, bi-annual ACS meeting took place (#ACSSanDiego), during which new drugs (and drug leads) are commonly disclosed.
Adding chemical compounds to Wikidata is not difficult. You can store the chemical formula (P274), (canonical) SMILES (P233), InChIKey (P235) (and InChI (P234), of course), as well as various database identifiers (see what I wrote about that here). It also allows storing the provenance, and has predicates for that too.
In April this year I blogged about an important SPARQL query for many chemists: getting CAS registry numbers from Wikidata.
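For readers who want a feel for what such a query looks like, here is a minimal sketch. It is not necessarily the query from the April post. It assumes the Wikidata Query Service, where the `wdt:` prefix is predefined, and uses P231, the Wikidata property for the CAS Registry Number. The function name and the default limit are made up for this example.

```python
def cas_number_query(limit=10):
    """Build a SPARQL query listing Wikidata items with a CAS registry number.

    P231 is the CAS Registry Number property; the LIMIT keeps the
    result small for a first try.
    """
    return (
        "SELECT ?compound ?cas WHERE {\n"
        "  ?compound wdt:P231 ?cas .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # Print the query so it can be pasted into a SPARQL endpoint.
    print(cas_number_query(5))
```

The function only builds the query text and makes no network request, so the result can be pasted into the query service by hand or passed to whatever SPARQL client you prefer.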
There are many fancy tools to edit ontologies. I like simple editors, like nano. And like any hacker, I can hack OWL ontologies in nano. The word hacking implies OWL was never meant to be edited in a simple text editor; I am not sure that is really true. Anyways, HTML5 and RDFa will do fine, and here is a brief write-up. This post will not cover the basics of RDFa and does assume you already know how triples work. If not, read this RDFa primer first.
Gang Fu and Evan Bolton have blogged about it previously, but their PubChemRDF paper is out now (doi:10.1186/s13321-015-0084-4). It very likely defines the largest collection of RDF triples using the CHEMINF ontology and I congratulate the authors on an increasingly powerful PubChem database.
If you are a scientist you have heard about the ORCID identifier by now. If not, you have been focusing on groundbreaking research and have isolated yourself from the rest of the world, just to make it perfect and get that Nobel prize next year.
I have promised my Twitter followers the SPARQL query you have all been waiting for. Sadly, you had to wait for it for more than two months. I’m sorry about that.