Update: the fourth edition is out.
I cannot find the bug report just now, but the CDK has an open problem with change event notification: the nonotify classes still cause change events to be sent around. This is because the nonotify classes extend the data classes in the wrong way. So, today I worked on copying the data class implementations into a new implementation that does not extend the data classes, while removing the listener code: the silent module.
Kasabi is a new RDF hosting service by Talis. It’s still in beta, and I have been testing it with the RDF version I created of ChemPedia Substances (the cool, but no longer existing, web service from MetaMolecular to draw and name organic molecules). Kasabi makes the RDF data available via a few APIs, depending on which APIs the uploader selects. I picked all five of them, just to see how things work.
Julio and Gert placed their ICCS 2011 work online, and today I was going through old CDs (see From the archives: Chemical Web, and the CDK in 2004 and Chiral Molecules: how cool is the SEM picture?). I also ran into my ICCS 2005 poster, and because that too was before I started blogging, I never posted it online.
Update 2021-02: this post is still the second-most read post on my blog. Welcome! Some updates: Ammar Ammar in our BiGCaT group has set up a new SPARQL endpoint. Please use it, and tweet, blog, or otherwise let others know how you use the ChEMBL RDF. Since this post I have blogged a lot more about ChEMBL. Update: this work is now written down in this paper. I’m having a really bad month, as you can see from the number of posts.
Some time ago, the brilliant GitHub people gave me the following tip. Rajarshi is lazy, and might find it interesting. By appending .patch to the commit URL, a commit can easily be downloaded as a patch. That way, developers can download it with wget or curl and apply it locally with git am, without having to fetch the full repository.
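The tip boils down to a simple URL rewrite, which can be sketched in a few lines of Python. This is only an illustration: the function names `patch_url` and `fetch_patch` and the example commit URL are made up for this sketch and do not point to a real commit.

```python
from urllib.request import urlopen


def patch_url(commit_url: str) -> str:
    """Return the patch URL for a GitHub commit URL by appending '.patch'."""
    # Drop a trailing slash first, so we do not end up with '/.patch'.
    return commit_url.rstrip("/") + ".patch"


def fetch_patch(commit_url: str) -> str:
    """Download the commit as patch text (needs network access)."""
    with urlopen(patch_url(commit_url)) as response:
        return response.read().decode("utf-8")


# Hypothetical commit URL, used for illustration only.
example = "https://github.com/example-user/example-repo/commit/0123abc"
example_patch_url = patch_url(example)
```

The text returned by `fetch_patch` could be saved to a file and applied with git am, just like a patch downloaded with wget or curl.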
Later this year the ninth International Conference on Chemical Structures (ICCS) will be held in the Netherlands. I had the pleasure of joining this meeting, I think, eight years ago, when I was doing my PhD in Nijmegen. Mind you, I did not attend the conference;
Oscar is a text miner. It mines text for chemistry. Oscar4 is the next iteration of the Oscar code, which I worked on in the past three months with Lezan, Sam, and David.
Mark’s new CCO/RDF hosting functionality (see also my post two days ago) requires RDF/XML format, so I updated my code to convert the Chempedia Substances data into RDF/XML instead of N3 (I have asked Rich to put a new download link online). This is the Groovy code I used:

    import groovy.json.JsonSlurper
    import groovy.xml.MarkupBuilder
    import groovy.util.IndentPrinter

    input = new File("substances.json")
    json = new JsonSlurper().parse(input);
    def writer = new StringWriter()
    def
Oscar uses a Maximum Entropy Markov Model (MEMM) based on n-grams. Peter Corbett has written this up (doi:10.1186/1471-2105-9-S11-S4). So, it basically is statistics once more. If you really want a proper bioinformatics education, do your PhD at a (proteo)chemometrics department. N-grams are word parts of n characters. For example, the trigrams of acetic acid include ace, cid, tic, eti, and aci.
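To make the trigram example concrete, here is a minimal Python sketch that slides a window of n characters over a string. It is not Oscar's own code: the function name `char_ngrams` is made up for this post, and Oscar may well treat spaces or word boundaries differently.

```python
def char_ngrams(text: str, n: int = 3) -> list[str]:
    """Return all substrings of length n, in order of appearance."""
    # A string of length L has L - n + 1 windows of width n.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


trigrams = char_ngrams("acetic acid")
```

Note that this naive version treats the space as just another character, so it also yields trigrams that span the two words.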