
In December last year I proposed the use of microformats and RDFa for simple semantic markup of molecular information.
Over the last few weeks I continued the work on getting (descriptor-based) QSAR/QSPR implemented in Bioclipse. JOELib (GPL) and the CDK (LGPL) are two prominent opensource engines that can calculate molecular descriptors, and AMBIT is a front-end. To be able to do QSAR/QSPR model building from start to end in Bioclipse, I worked in April on an architecture for selecting descriptors.
Pedro suggested in the Nature Networks What’s Next forum that Nature should add a new service for scientists: hosting electronic lab notebooks. I think this will be a killer application. I am rather excited about the idea, and feel ashamed of not having put one and one together myself.
Last week I started the Blue Obelisk Chemical Test File Repository, a repository of test files (from various sources) under OSI-approved licenses, meant to improve interoperability between chemoinformatics software. Following an earlier discussion on the mailing list, a directory hierarchy has been set up, and each directory contains an index.xml to describe the content.
Ryan blogged in Archive This about some advice from ACD on how to store spectra in your electronic lab notebook. Use InChI. This reminded me of a discussion I had with Colin when he was at the CUBIC, which was about experimental sections. I proposed that the InChI should have a prominent place in the experimental section.
On July 1st I will start a post-doc in Wageningen, The Netherlands, at the WUR. More precisely, I will work in the group of Prof. Van Eeuwijk at Biometris, cooperating with the group of Prof. Hall at Plant Research International (PRI), within the framework of the new Netherlands Metabolomics Center.
Lately, Chemical blogspace has seen an interesting discussion on the quality of opendata and free chemical databases (over 32 free resources now), such as the NMRShiftDB.org. For example, see Antony’s view on the NMRShiftDB and Robien’s analysis. Opendata makes such quality assurance possible, and I am happy that the NMRShiftDB was explored like this; the problems found can be reported and corrected.
Only a few people are using InChIs to indicate the molecules they blog about (prominent exceptions are Useful Chemistry and Molecule of the Day). Consequently, the number of detected molecules (without using OSCAR3) in Chemical blogspace has been low. Fortunately, many more people use links to Wikipedia to identify the molecules they talk about.
Last year the Programmeerzomer.nl sponsored one summer student to work on Bioclipse (see the announcement). The Programmeerzomer is much like the Google Summer of Code, where I mentor Alexandr. However, it is much smaller and aimed at just the NL area: both the student and the mentor need to be Dutch, but the opensource project does not.
While looking up a reference for FirstGlance in Jmol, I found Janocchio, a CDK- and Jmol-based tool for prediction of coupling constants, recently published in Magnetic Resonance in Chemistry. It’s written by Evans, Bodkin, Baker and Sharman (from Eli Lilly) and licensed LGPL. It is one of those rare contributions from the pharmaceutical industry, and I deeply appreciate it.
Some 7 years ago, following successes in physics, ChemWeb.com launched the Chemistry Preprint Server (CPS), and Warr evaluated it in a JCIM article three years later. She wrote about ‘lessons learned’, but the only one seemed to have been that chemistry was not ready for it, as the project shut down in 2004. The archives are still available, fortunately, and you may find it amusing to look up my submission or someone else's.