chem-bla-ics
Chemblaics (pronounced chem-bla-ics) is the science that uses open science and computers to solve problems in chemistry, biochemistry and related fields.
Groovy, Chemistry, Rdf, Json, Chemical Sciences, English

Mark’s new CCO/RDF hosting functionality (see also my post two days ago) requires RDF/XML format, so I updated my code to convert the Chempedia Substances data into RDF/XML instead of N3 (I have asked Rich to put a new download link online). This is the Groovy code I used:

    import groovy.xml.MarkupBuilder
    import groovy.util.IndentPrinter

    input = new File("substances.json")
    json = new JsonSlurper().parse(input);
    def writer = new StringWriter()
    def

Oscar, Textmining, Chemical Sciences, English

Oscar uses a Maximum Entropy Markov Model (MEMM) based on n-grams. Peter Corbett has written this up (doi:10.1186/1471-2105-9-S11-S4). So it is basically statistics once more. If you really want a proper bioinformatics education, do your PhD at a (proteo)chemometrics department. N-grams are word parts of n characters. For example, the trigrams of acetic acid include ace, cid, tic, eti, and aci.
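
To make the trigram example concrete, here is a minimal Java sketch of character n-gram extraction. The class name NGrams and its method of are made up for illustration and are not part of Oscar. The sketch slides a window over the raw string, spaces included, which may well differ from how Oscar prepares its tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not Oscar's API.
class NGrams {

    /** Returns every run of n consecutive characters in text, in order of appearance. */
    static List<String> of(String text, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> grams = new ArrayList<>();
        // Slide a window of width n over the string, one character at a time.
        for (int i = 0; i + n <= text.length(); i++) {
            grams.add(text.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Prints all trigrams of the example name, including those that span the space.
        System.out.println(NGrams.of("acetic acid", 3));
    }
}
```

For "acetic acid" this yields nine trigrams: the five listed above, plus cet and three that contain the space.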

Oscar, Chemicaltagger, Beilstein, Chemical Sciences, English

The two earlier posts in this series showed screenshots of results of Oscar, but the title also promised results by Lezan’s ChemicalTagger. Sam helped with getting the HTML pages online via the Cambridge Hudson installation. Where Oscar finds named entities (chemical compounds, processes, etc.), ChemicalTagger finds roles, like solvent, acid, base, catalyst. Roles are properties of chemical compounds in certain situations.

Oscar, Java, Chemical Sciences, English

Say you have your own dictionary of chemical compounds, for example your company’s list of yet-unpublished internal research codes. You want to index your local listserv to make it easier for your employees to search for particular chemistry you are working on, which is perhaps related to something done at other company sites. This is what Oscar is for. But it will need to understand things like UK-92,480.
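
Why a code like UK-92,480 is awkward can be shown with a small Java sketch. It does not use Oscar's tokeniser or dictionary API. The class ResearchCodes and its regular expression are hypothetical, and the pattern only assumes that a research code is a few capital letters, a hyphen, and digits with optional comma-separated groups of three.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not Oscar's API.
class ResearchCodes {

    // Assumed shape of a research code: 2-4 capitals, a hyphen, then digits
    // with optional comma-separated groups of three (as in UK-92,480).
    private static final Pattern CODE =
            Pattern.compile("\\b[A-Z]{2,4}-\\d+(?:,\\d{3})*\\b");

    /** Returns every substring of text that matches the assumed code pattern. */
    static List<String> find(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = CODE.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String sentence = "We compared UK-92,480 with the older series.";
        // Naive splitting on whitespace and commas cuts the code in two.
        System.out.println(Arrays.asList(sentence.split("[\\s,]+")));
        // The pattern keeps the code together as a single entity.
        System.out.println(find(sentence));
    }
}
```

The point of the sketch is only that the comma inside the code must not be treated as a token boundary. A real setup would take the codes from the company dictionary instead of guessing their shape with a pattern.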

Oscar, Textmining, Beilstein, Chemical Sciences, English

One goal of my three-month project is to take Oscar4 to the community. We want to get it used more, and we need a larger development community. Oscar4 and the related technologies do a good, sometimes excellent, job, but have to be maintained, just like any other piece of code. To make using it easier, we are developing new APIs, as well as two user-oriented applications: a Taverna 2 plugin and command-line utilities.

Cito, Citeulike, Cdk, Wordle, Chemical Sciences, English

Last month I reported a few things I missed in CiteULike. One of them was support for CiTO (see doi:10.1186/2041-1480-1-S1-S6), a great Citation Typing Ontology. I promised the CiTO author, David, my use cases, but have been horribly busy in the past few weeks with my new position, wrapping up my past position, and thinking about my position after Cambridge.

Oscar, Java, Chebi, Chemical Sciences, English

Besides getting Oscar used by ChEBI (hopefully via Taverna), my main task in my three-month Oscar project is to refactor things to make it more modular, and to remove some features that are no longer needed (e.g. an automatically created workspace environment). Clearly, I need to define a lot of new unit tests to ensure my assumptions on how the code works are valid. So, what are the API requirements set out?

Oscar, Textmining, Chebi, Chemical Sciences, English

As Peter announced in his blog, and I tweeted earlier, I have started as a postdoctoral research associate in Peter’s group at the University of Cambridge, to work for the next three months on Oscar, a chemical text mining tool. My tasks will focus on programmatic plumbing instead of method development, and I am aiming at integration with CDK-Taverna (see doi:10.1186/1471-2105-11-159), which is currently being ported to Taverna 2.2 by Andreas.