Rogue Scholar

Chemical Sciences · English

A Guide to Molecular Standardization

Published July 27, 2020

What set of features uniquely characterizes a given molecule? What modes of representation should be fixed or rejected, and under what conditions? Given that machine-based molecular encodings have been in use for more than sixty years, it might seem that such questions have long since been resolved. Nevertheless, the topic casts a long shadow to this day, particularly when dataset curation or property prediction comes into play.

Chemical Sciences · English

Reading Large SDfiles in Rust

https://doi.org/10.59350/aw06k-5fy25

Published July 22, 2020

Author: Richard L. Apodaca

Chemical data analysis pipelines often start with reading SDfiles, the field’s de facto standard for information exchange. Given the growing size of many chemistry data sets, efficient methods for reading SDfiles have become ever more important. As part of a continuing series on Rust for Cheminformatics, this article takes a hands-on first look at reading arbitrarily large SDfiles in Rust.
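As a rough illustration of what reading an arbitrarily large SDfile can look like, the sketch below streams one record at a time rather than loading the whole file into memory. It relies only on the fact that SDfile records end with a line containing `$$$$`. The `Records` type and its methods are hypothetical names chosen for this example, not code from the article.

```rust
use std::io::{self, BufRead};

/// Hypothetical iterator that yields one SDfile record at a time,
/// so memory use depends on the largest record, not the file size.
pub struct Records<R: BufRead> {
    reader: R,
    done: bool,
}

impl<R: BufRead> Records<R> {
    pub fn new(reader: R) -> Self {
        Records { reader, done: false }
    }
}

impl<R: BufRead> Iterator for Records<R> {
    type Item = io::Result<String>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.done {
            return None;
        }
        let mut record = String::new();
        let mut line = String::new();
        loop {
            line.clear();
            match self.reader.read_line(&mut line) {
                // Report the I/O error once, then stop iterating.
                Err(error) => {
                    self.done = true;
                    return Some(Err(error));
                }
                // End of input: emit a trailing unterminated record, if any.
                Ok(0) => {
                    self.done = true;
                    return if record.trim().is_empty() {
                        None
                    } else {
                        Some(Ok(record))
                    };
                }
                // The "$$$$" delimiter line closes the current record.
                Ok(_) if line.trim_end() == "$$$$" => return Some(Ok(record)),
                // Any other line belongs to the current record.
                Ok(_) => record.push_str(&line),
            }
        }
    }
}
```

To read from disk, one would wrap a file in `std::io::BufReader` and pass it to `Records::new`. The iterator returns raw record text only; parsing the molfile block and data items is left out of this sketch.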

Chemical Sciences · English

The SDfile Format

https://doi.org/10.59350/1xbpa-y6313

Published July 13, 2020

Author: Richard L. Apodaca

Chemical datasets often need to be exchanged with high fidelity. A number of file formats enabling such exchange can be found in the wild, but the most common by far is the structure-data file (SDfile, aka “SD File,” “SD file,” or “SDF”). Although the format appears simple on the surface, there are some subtleties to consider. This article takes a closer look at the de facto standard for information exchange in cheminformatics.

Chemical Sciences · English

Rust and WebAssembly from Scratch: Hello World with Strings

https://doi.org/10.59350/4w3gw-mfr52

Published July 7, 2020

Author: Richard L. Apodaca

Like most successful duos, Rust and WebAssembly (Wasm) complement each other. Rust is a typesafe systems language with modern tooling and high-level features. WebAssembly is a portable compilation target/execution environment for the Web browser and beyond. The combination makes it possible to write fast, stable software that runs anywhere without recompilation. But there’s a catch.

Chemical Sciences · English

Compiling Rust to WebAssembly: A Simple Example

https://doi.org/10.59350/gvens-nq331

Published June 29, 2020

Author: Richard L. Apodaca

Rust and WebAssembly (Wasm) are often discussed together. The former is a typesafe systems programming language with modern tooling and many high-level features. The latter is a portable, secure execution environment that runs inside and outside the browser. The combination promises many years of progress and utility.

Chemical Sciences · English

Returning Rust Iterators

https://doi.org/10.59350/5dfa6-gem63

Published June 22, 2020

Author: Richard L. Apodaca

Rust iterators are fundamental to the language and can be found in a variety of contexts. Consuming iterators returned from functions in the standard library and crates is straightforward. Eventually, however, you’ll want to return iterators from your own functions. This article discusses the major approaches to this surprisingly complex problem. It’s based in part on answers to this question.
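For orientation, here is a minimal sketch of two commonly used ways to return an iterator from a function: `impl Trait` for static dispatch and a boxed trait object for dynamic dispatch. The function names are hypothetical, and the article itself may cover other approaches or weigh these differently.

```rust
/// Static dispatch: the caller sees only "some iterator of u32".
/// The concrete adapter type stays hidden and nothing is allocated.
pub fn evens(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}

/// Dynamic dispatch: the two branches produce different concrete types,
/// so the iterator is boxed behind a trait object.
pub fn count(limit: u32, reverse: bool) -> Box<dyn Iterator<Item = u32>> {
    if reverse {
        Box::new((0..limit).rev())
    } else {
        Box::new(0..limit)
    }
}
```

The `impl Iterator` form requires every return path to produce the same concrete type, which is why `count` falls back to `Box<dyn Iterator>` at the cost of a heap allocation and virtual calls.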

Chemical Sciences · English

OxMol: Rust/Python Bindings for ChemCore

https://doi.org/10.59350/djtgr-6jz25

Published June 15, 2020

Author: Richard L. Apodaca

Python is the most popular orchestration language in scientific computing. Across a variety of fields, Python provides high-level interfaces to fast code written in other languages. A previous article introduced ChemCore, a new cheminformatics library written in Rust. This article moves the idea another step forward by introducing OxMol, Python bindings for ChemCore. The OxMol documentation describes two installation methods.

Chemical Sciences · English

Hydrogen Suppression in SMILES

https://doi.org/10.59350/sr0mv-gjz70

Published June 8, 2020

Author: Richard L. Apodaca

SMILES is the most widely used line notation in cheminformatics, and one of two standard information exchange formats. Like Molfile, SMILES supports hydrogen suppression, a method for representing monovalent hydrogens and associated bonds without explicitly encoding them within the molecular graph. But getting that system to work well requires clear, readily available documentation. This article attempts to solve that problem.
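To make the idea of suppressed hydrogens concrete, the sketch below computes an implicit hydrogen count for an unbracketed, non-aromatic atom. It assumes the rule described in the OpenSMILES specification: take the smallest standard valence that is at least the sum of the atom's bond orders, and fill the difference with hydrogens. The function name is hypothetical, aromatic atoms and bracket atoms are deliberately ignored, and the article's own treatment may differ in detail.

```rust
/// Implicit hydrogen count for an unbracketed, non-aromatic SMILES atom.
/// Returns None for symbols outside the organic subset handled here.
pub fn implicit_hydrogens(symbol: &str, bond_order_sum: u8) -> Option<u8> {
    // Standard valences for the organic subset (assumed from OpenSMILES).
    let valences: &[u8] = match symbol {
        "B" => &[3],
        "C" => &[4],
        "N" | "P" => &[3, 5],
        "O" => &[2],
        "S" => &[2, 4, 6],
        "F" | "Cl" | "Br" | "I" => &[1],
        _ => return None,
    };
    // Choose the smallest valence that accommodates the existing bonds.
    // If every valence is exceeded, no hydrogens are added.
    let target = valences.iter().copied().find(|&v| v >= bond_order_sum);
    Some(target.map_or(0, |v| v - bond_order_sum))
}
```

Under this rule the SMILES `C` (no bonds) carries four implicit hydrogens, while the oxygen in `CO` (one single bond) carries one.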

Chemical Sciences · English

ChemCore: A Cheminformatics Toolkit for Rust

https://doi.org/10.59350/e1bsd-5yx98

Published June 1, 2020

Author: Richard L. Apodaca

Chemistry imposes formidable requirements on application developers. One of the toughest is the manipulation of chemical structures as first-class data structures. General-purpose programming languages don’t fulfil this requirement, so the responsibility falls to a peculiar layer of software.

Chemical Sciences · English

Let's Build a SMILES Parser in Rust

https://doi.org/10.59350/wbs0x-ezz89

Published May 25, 2020

Author: Richard L. Apodaca

SMILES is a widely used language for chemical structure exchange. As such, no cheminformatics toolkit today would be complete without a SMILES reader and writer. This article describes the design and initial implementation of Purr, a toolkit-agnostic library for working with SMILES in Rust. In its current form, Purr can parse most of the SMILES language. Previous articles from this blog may be helpful in understanding Purr’s purpose and design.

Chemical Sciences · English

Hydrogen Suppression in Cheminformatics

https://doi.org/10.59350/c543k-c0d76

Published May 18, 2020

Author: Richard L. Apodaca

Hydrogen bears the distinction of being both the most common element in the universe and one of the most predictable elements on the periodic table. Outnumbering carbon in many organic molecules, hydrogen rarely participates in anything more exciting than hydrogen bonding or acid-base reactions. For the most part, the monovalent hydrogen atoms studding the average organic molecule are ignored.

Depth-First
