chem-bla-ics

Chemblaics (pronounced chem-bla-ics) is the science that uses open science and computers to solve problems in chemistry, biochemistry and related fields.
Google, CiteULike, Chemistry, English
Published

Web of Science is my de facto standard for citation statistics (I need these for VR grant applications), and defines the lower limit of citations (it is pretty clean, but I do have to ping them now and then to fix something). Its public front-end is Researcher ID. There is a Microsoft initiative, which looks clean but doesn’t work on Linux for the nicer things, and its coverage of journals is pretty bad in my field, giving a biased

CDK, Chemistry, English
Published

I cannot find the bug report just now, but the CDK has an open problem with change event notification, where the nonotify classes still caused change events to be sent around. This was because the nonotify classes extended the data classes in the wrong way. So, I worked today on copying the data class implementations into a new implementation that does not extend the data classes, while removing the listener code: the silent module.
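The post does not say exactly how the inheritance went wrong, so the following is only a minimal sketch of the design choice, not the CDK code itself. It uses hypothetical Python classes (`Atom`, `SilentAtom`) in place of the CDK's Java classes, and it assumes the silent variant keeps a no-op `add_listener` for interface compatibility.

```python
class Atom:
    """Data-style class (hypothetical): every change is reported to listeners."""

    def __init__(self, symbol):
        self._symbol = symbol
        self._listeners = []

    def add_listener(self, listener):
        self._listeners.append(listener)

    def get_symbol(self):
        return self._symbol

    def set_symbol(self, symbol):
        self._symbol = symbol
        # Fire a change event to every registered listener.
        for listener in self._listeners:
            listener(self)


class SilentAtom:
    """Standalone class (hypothetical): same accessors, no listener code.

    It deliberately does not extend Atom, so no notification logic can be
    inherited by accident.
    """

    def __init__(self, symbol):
        self._symbol = symbol

    def add_listener(self, listener):
        # Assumption: accepted for interface compatibility, but never stored.
        pass

    def get_symbol(self):
        return self._symbol

    def set_symbol(self, symbol):
        self._symbol = symbol
```

Because `SilentAtom` shares no code with `Atom`, a setter added to the data class later cannot start sending events from the silent one; the price is duplicated accessor code.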

Semweb, Kasabi, Chemistry, English
Published

Kasabi is a new RDF hosting service by Talis. It’s still in beta, and I have been testing it with the RDF version I created of ChemPedia Substances (the cool, now defunct web service from MetaMolecular to draw and name organic molecules). Kasabi makes the RDF data available via a few APIs, depending on which APIs the uploader selects. I picked all five of them, just to see how things work.

ChEMBL, RDF, Chemistry, English
Published

Update 2021-02: this post is still the second-most read post in my blog. Welcome! Some updates: Ammar Ammar in our BiGCaT group has set up a new SPARQL endpoint. Please use it, and tweet, blog, or otherwise let others know how you use the ChEMBL RDF. Since this post I have blogged a lot more about ChEMBL. Update: this work is now written down in this paper.

GitHub, Chemistry, English
Published

Some time ago, the brilliant GitHub people gave me the following tip. Rajarshi is lazy, and might find it interesting. By appending .patch to the commit URL, a commit can easily be downloaded as a patch. That way, developers can easily download it with wget or curl and apply it locally with git am, without having to fetch the full repository.
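The tip above can be sketched in a few lines of Python. This is an illustration under assumptions, not an official GitHub client: the helper names `patch_url` and `apply_commit` are made up here, and the commit URL in the usage note is a placeholder. It relies on `git am` reading the patch from standard input when no file is given.

```python
import subprocess
import urllib.request


def patch_url(commit_url):
    """Return the patch URL for a GitHub commit page URL (hypothetical helper)."""
    # The tip: append ".patch" to the commit URL.
    return commit_url.rstrip("/") + ".patch"


def apply_commit(commit_url, repo_dir="."):
    """Download the commit as a patch and apply it with `git am`."""
    with urllib.request.urlopen(patch_url(commit_url)) as response:
        patch = response.read()
    # `git am` reads the mailbox-formatted patch from stdin.
    subprocess.run(["git", "am"], input=patch, cwd=repo_dir, check=True)
```

For a placeholder URL such as `https://github.com/example/project/commit/0123abc`, `patch_url` returns the same URL with `.patch` appended; calling `apply_commit` with it inside a local clone is then roughly equivalent to piping `curl` into `git am`.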

Groovy, Chemistry, RDF, JSON, English
Published

Mark’s new CCO/RDF hosting functionality (see also my post two days ago) requires RDF/XML format, so I updated my code to convert the Chempedia Substances data into RDF/XML instead of N3 (I have asked Rich to put a new download link online). This is the Groovy code I used:

import groovy.json.JsonSlurper
import groovy.xml.MarkupBuilder
import groovy.util.IndentPrinter

input = new File("substances.json")
json = new JsonSlurper().parse(input)
def writer = new StringWriter()
def

Oscar, Text mining, Chemistry, English
Published

Oscar uses a Maximum Entropy Markov Model (MEMM) based on n-grams. Peter Corbett has written this up (doi:10.1186/1471-2105-9-S11-S4). So, it basically is statistics once more. If you really want a proper bioinformatics education, do your PhD at a (proteo)chemometrics department. N-grams are word parts of n characters. For example, the trigrams of acetic acid include ace, cid, tic, eti, and aci.
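To make the trigram example concrete, here is a small Python sketch. It is not Oscar's implementation: the function name `char_ngrams` is made up, and it assumes n-grams are taken inside each whitespace-separated word, without padding or crossing word boundaries.

```python
def char_ngrams(text, n=3):
    """Return the character n-grams of each word in text, in order."""
    grams = []
    for word in text.split():
        # Slide a window of n characters across the word; words shorter
        # than n characters contribute nothing.
        grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


print(char_ngrams("acetic acid"))
# ['ace', 'cet', 'eti', 'tic', 'aci', 'cid']
```

The five trigrams named in the text (ace, cid, tic, eti, aci) all appear in this list, together with cet.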

Oscar, ChemicalTagger, Beilstein, Chemistry, English
Published

The two earlier posts in this series showed screenshots of results of Oscar, but the title also promised results by Lezan’s ChemicalTagger. Sam helped with getting the HTML pages online via the Cambridge Hudson installation. Where Oscar finds named entities (chemical compounds, processes, etc.), ChemicalTagger finds roles, like solvent, acid, base, catalyst. Roles are properties of chemical compounds in certain situations.