I do not care about physical and chemical properties in Wikipedia, as I can easily extract them from other sources. The main value of Wikipedia for molecules is, I think, that it describes the history of a molecule.
Well, no wonder: Excel is meant to be used to process money flows. Anyway, greyarea pointed me to this nice blog item from March 2006. It discusses a 2004 article in BMC Bioinformatics, "Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics", by Barry Zeeberg et al. (DOI:10.1186/1471-2105-5-80). Hence the importance of semantics and proper markup languages.
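To make the problem concrete, here is a minimal sketch of a check for identifiers that a spreadsheet is likely to reinterpret. The function name `risky_identifier` and the two patterns are my own assumptions for illustration, not something taken from the article; they only cover the two kinds of damage the paper is known for, gene symbols read as dates (such as SEPT2 or DEC1) and RIKEN-style identifiers read as floating point numbers (such as 2310009E13).

```python
import re

# Month-like prefixes a spreadsheet may parse as a date (assumed, heuristic list).
_MONTHS = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)
_DATE_LIKE = re.compile(
    r"^(?:%s)[-/ ]?\d{1,2}$" % "|".join(_MONTHS), re.IGNORECASE
)

# Digits, a single E, digits: reads as scientific notation.
_FLOAT_LIKE = re.compile(r"^\d+E\d+$", re.IGNORECASE)


def risky_identifier(name):
    """Return 'date', 'float', or None for a single identifier string."""
    name = name.strip()
    if _DATE_LIKE.match(name):
        return "date"
    if _FLOAT_LIKE.match(name):
        return "float"
    return None


if __name__ == "__main__":
    for symbol in ["SEPT2", "DEC1", "2310009E13", "TP53"]:
        print(symbol, risky_identifier(symbol))
```

Running such a check over a gene list before it goes anywhere near a spreadsheet would at least flag the candidates; it does not replace importing the column explicitly as text.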
RDF might be the solution we are looking for to get a grip on the huge amount of information we are facing. Microformats and RDFa are just solutions along the way, and Gleaning Resource Descriptions from Dialects of Languages (GRDDL) might be an important tool to get the web RDF-ied. One important aspect of RDF is that any resource has a unique URI.
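Why the unique URI matters can be shown with a small sketch. It models RDF statements as plain (subject, predicate, object) tuples rather than using an RDF library, and all `example.org` URIs and the `describe` helper are hypothetical names made up for the illustration.

```python
# Hypothetical URIs, for illustration only.
ASPIRIN = "http://example.org/molecule/aspirin"
HAS_NAME = "http://example.org/prop/name"
HAS_FORMULA = "http://example.org/prop/formula"

# Two independent sources, each making one statement about the same resource.
source_a = {(ASPIRIN, HAS_NAME, "aspirin")}
source_b = {(ASPIRIN, HAS_FORMULA, "C9H8O4")}

# Because both use the same URI as subject, merging is a plain set union.
merged = source_a | source_b


def describe(graph, subject):
    """Collect predicate -> object for every triple about one subject."""
    return {p: o for (s, p, o) in graph if s == subject}


if __name__ == "__main__":
    print(describe(merged, ASPIRIN))
```

No record matching or name disambiguation is needed here: the shared URI alone is what ties the two statements together.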
I had some time to work some more on the QSAR functionality in Bioclipse. There is still much to do, but it is getting there. The calculation of a QSAR descriptor data matrix: this screenshot shows that multi-resource selection is now working, and that the calculation is now a Job.
Igor wrote a message to the CCL mailing list about OSRA. The email does not give any information on the failure rate, but the demo they provide via the web interface does show some minor glitches (the bromine is not recognized). The source reuses OpenBabel and uses the GPL license. Its value is equal to that of text mining tools like OSCAR3, and together they sound like the Jordan and Pippen of mining chemical literature.
Deepak blogged about screencasting for bio topics, collected at bioscreencast.com, of which he is co-owner. I guess it is like a YouTube for bioinformatics thingies. Jean-Claude picked this up very quickly (seen on Cb? At least I did.), and already uploaded a screencast, demoing JSpecView written by Robert. I wonder if he will upload the screencasts he made for Bioclipse too?
The Chemistry Development Kit has a rich set of data classes, each of which is defined by an interface. The classes for atoms, bonds and a connectivity table are fairly straightforward, but beyond that it is sometimes not entirely clear. I will now discuss all interfaces in a series of blog items. I’ll start with the IChemFile. Christoph, please correct me if I move too far away from our Notre Dame board sketch.
So, with all these people blogging about the Open Science Notebook (yes, each word is one distinct blog) it is worth looking back in time. To make clear what I put under the OSN: a notebook in which experimental details and outcome are written down. So, what did the OSN look like almost ten years ago?
Second in a series of articles summarizing articles that cite one of the main CDK articles for CDK News. The first CDK Literature was already half a year ago, so it was about time. Bioclipse: nothing much I have to say about that. Just browse my blog and you’ll see that it heavily uses CDK, JChemPaint and Jmol. See also the Bioclipse blog.
Chemical blogspace has seen a lengthy discussion on the quality of a few NMR shift prediction programs, and Ryan wanted to make a final statement. Further down in his blog item he had this quote from Jeff, discussing the use of the NMRShiftDB as external test set: I’m sure none of us knows what weird chemistry people are doing;
Every one of us knows that big pile of paper on the desk that contains the things we want to read, scan or just browse. I even have an electronic equivalent. Another pile contains leaflets and glossy folders from conferences, like the ACS meeting in Chicago. OK, I am going to get rid of those last ones, and will shortly put the links here. The first leaflet is from Chemistry Central, one of the open access publishers.