An anonymous reader reported that the American Medical Association published the structure of dapagliflozin. Here are the details.
Mike released Operator 0.8, which picks up RDF (RDFa and eRDF) from HTML pages and adds actions to it. I blogged earlier about the beta and wrote a script for it for chemical RDFa. At this moment, Chemical blogspace and RDF for Molecular Space (see this blog) are using chemical RDFa to semantically mark up molecular information.
Via SciFoo Planet (from Partial immortalization) I learned about TouchGraph Google (Peter brought it into Chemical blogspace). It’s cool, though not open source.
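To give an idea of what picking up RDFa from an HTML page involves, here is a minimal Python sketch. It is not how Operator works internally, and it covers only a small part of RDFa: the `about`, `property` and `content` attributes, without nesting or prefix resolution. The `chem:inchi` and `chem:name` property names are hypothetical, made up for this example.

```python
from html.parser import HTMLParser


class SimpleRDFaParser(HTMLParser):
    """Collects (subject, property, value) triples from a small RDFa subset."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._subject = None   # value of the most recent 'about' attribute
        self._pending = None   # property still waiting for its text content
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "about" in attributes:
            # Simplification: the subject stays active until the next 'about'.
            self._subject = attributes["about"]
        prop = attributes.get("property")
        if prop is None:
            return
        if "content" in attributes:
            # The value is given explicitly in the 'content' attribute.
            self.triples.append((self._subject, prop, attributes["content"]))
        else:
            # The value is the text inside the element; collect it until the end tag.
            self._pending = prop
            self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._pending is not None:
            value = "".join(self._text).strip()
            self.triples.append((self._subject, self._pending, value))
            self._pending = None


def extract_triples(html):
    parser = SimpleRDFaParser()
    parser.feed(html)
    parser.close()
    return parser.triples


# Hypothetical markup; the chem: properties are assumptions for illustration.
SAMPLE = (
    '<div about="#methane">'
    '<span property="chem:inchi">InChI=1/CH4/h1H4</span>'
    '<span property="chem:name" content="methane"></span>'
    '</div>'
)

for triple in extract_triples(SAMPLE):
    print(triple)
```

A real RDFa processor also has to handle namespaces, `rel`/`href`, and subject scoping of nested elements, which this sketch skips on purpose.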
Peter wondered if data should be stored centrally or decentrally, when Deepak blogged about Freebase and Metaweb. Now, I haven’t really looked into these two projects, but the question of centralized versus decentralized is interesting. It’s MySQL versus the world wide web; it’s the PubChem compound ID versus the InChI; it’s http://cb.openmolecules.net/rdf/?InChI=1/CH4/h1H4 versus info:inchi/InChI=1/CH4/h1H4 (see RDF-ing molecular space).
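The difference between the two identifiers is easy to show in a few lines of Python. This is only a sketch: the function names are mine, and it simply concatenates strings in the pattern of the two examples above, without any percent-encoding that more complex InChIs might need.

```python
RESOLVER_BASE = "http://cb.openmolecules.net/rdf/?"  # one central server
INFO_PREFIX = "info:inchi/"                          # no server involved


def _check(inchi):
    # Both forms above embed the full 'InChI=...' string.
    if not inchi.startswith("InChI="):
        raise ValueError("expected a string starting with 'InChI='")
    return inchi


def centralized_uri(inchi):
    """URI that depends on one particular host to be resolvable."""
    return RESOLVER_BASE + _check(inchi)


def decentralized_uri(inchi):
    """URI that anyone can compute from the structure alone."""
    return INFO_PREFIX + _check(inchi)


methane = "InChI=1/CH4/h1H4"
print(centralized_uri(methane))
print(decentralized_uri(methane))
```

Both identifiers carry the same InChI; the first one additionally ties the molecule to a single host, the second one does not.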
Rich blogged about this in Never Draw the Same Molecule Twice: Viewing Image Metadata, in which he shows his molecular editor outputting images of molecular structures where the connection table of the structure is embedded in the image. His molecular editor can read the image again, and will automatically pick up the embedded connection table. Noel showed that this can be done not only in Java, but in Python too.
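I do not know which image format or metadata field Rich and Noel actually use, so the following is only a sketch of the general idea, assuming a PNG image and its standard `tEXt` chunk. The keyword `molfile`, the function names and the placeholder connection table are all made up for this example; only the Python standard library is used.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png):
    """Yield (type, data) for every chunk in a PNG byte string."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        yield png[pos + 4:pos + 8], png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def embed_text(png, keyword, text):
    """Return a copy of the PNG with a tEXt chunk inserted before IEND."""
    payload = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    out = [PNG_SIGNATURE]
    for chunk_type, data in iter_chunks(png):
        if chunk_type == b"IEND":
            out.append(make_chunk(b"tEXt", payload))
        out.append(make_chunk(chunk_type, data))
    return b"".join(out)


def read_text(png, keyword):
    """Return the text stored under the keyword, or None if absent."""
    for chunk_type, data in iter_chunks(png):
        if chunk_type == b"tEXt":
            key, _, value = data.partition(b"\x00")
            if key.decode("latin-1") == keyword:
                return value.decode("latin-1")
    return None


def tiny_png():
    """A 1x1 grayscale PNG, standing in for a rendered structure image."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte plus one pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


# Placeholder text; a real editor would store e.g. a complete molfile here.
connection_table = "methane\natoms: C\nbonds: none"
image = embed_text(tiny_png(), "molfile", connection_table)
print(read_text(image, "molfile"))
```

Image viewers ignore the extra chunk, so the picture still displays normally, while an editor that knows the keyword can recover the structure without redrawing it.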
I reported last week about the Molecules in Wikipedia and the plethora of templates used.
I do not care about physical and chemical properties in Wikipedia, as I can easily extract them from other sources. The main value of Wikipedia for molecules is, I think, that it describes the history of a molecule.
Well, no wonder: Excel is meant to be used to process money flows. Anyway, greyarea pointed me to this nice blog item from March 2006. It discusses a 2004 article in BMC Bioinformatics, "Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics" by Barry Zeeberg et al. (DOI:10.1186/1471-2105-5-80). Hence the importance of semantics and proper markup languages.
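The article describes gene symbols such as SEPT2 and DEC1 being turned into dates, and RIKEN identifiers such as 2310009E13 being read as floating-point numbers. A rough way to flag identifiers at risk before they go anywhere near a spreadsheet could look like the sketch below. The two regular expressions are my own heuristics, not Excel's actual conversion rules, so expect both misses and false alarms.

```python
import re

# Heuristic: a month name or abbreviation followed by one or two digits.
DATE_LIKE = re.compile(
    r"^(JAN|FEB|MAR|MARCH|APR|APRIL|MAY|JUN|JUNE|JUL|JULY|AUG|SEP|SEPT|OCT|NOV|DEC)"
    r"-?\d{1,2}$",
    re.IGNORECASE,
)

# Heuristic: digits, a single 'E', digits, which looks like scientific notation.
FLOAT_LIKE = re.compile(r"^\d+E\d+$", re.IGNORECASE)


def excel_risk(identifier):
    """Return 'date', 'float', or None for an identifier string."""
    value = identifier.strip()
    if DATE_LIKE.match(value):
        return "date"
    if FLOAT_LIKE.match(value):
        return "float"
    return None


for name in ["SEPT2", "DEC1", "2310009E13", "TP53"]:
    print(name, excel_risk(name))
```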
Days after the release of OSRA last week, I saw optical chemical structure recognition on the front page of my favorite Dutch /. equivalent, Tweakers.net: Duitsers leren computer chemische structuren herkennen ("Germans teach computer to recognize chemical structures"), written by René Gerritsen.
RDF might be the solution we are looking for to get a grip on the huge amount of information we are facing. Microformats and RDFa are just solutions along the way, and Gleaning Resource Descriptions from Dialects of Languages (GRDDL) might be an important tool to get the web RDF-ied.
I had some time to work some more on the QSAR functionality in Bioclipse. There is still much to do, but it is getting there.