Chemical Sciences · English · Hugo

Depth-First

Recent content on Depth-First

What set of features uniquely characterizes a given molecule? What modes of representation should be fixed or rejected, and under what conditions? Given that machine-based molecular encodings have been in use for more than sixty years, it might seem that such questions have long since been resolved. Nevertheless, the topic casts a long shadow to this day, particularly when dataset curation or property prediction comes into play.

Chemical data analysis pipelines often start with reading SDfiles, the field’s de facto standard for information exchange. Given the growing size of many chemistry data sets, efficient methods for reading SDfiles have become ever more important. As part of a continuing series on Rust for Cheminformatics, this article takes a hands-on first look at reading arbitrarily large SDfiles in Rust.
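As a rough illustration of what reading an arbitrarily large SDfile can look like, the sketch below streams records one at a time instead of loading the whole file into memory. It relies only on the fact that each SDfile record ends with a line containing `$$$$`. The `SdRecords` type is a hypothetical name introduced here for illustration; it is not taken from the article and does not parse the molfile or data items inside a record.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper (not from the article): yields one raw SDfile
/// record at a time from any buffered reader.
pub struct SdRecords<R: BufRead> {
    reader: R,
}

impl<R: BufRead> SdRecords<R> {
    pub fn new(reader: R) -> Self {
        Self { reader }
    }
}

impl<R: BufRead> Iterator for SdRecords<R> {
    type Item = io::Result<String>;

    fn next(&mut self) -> Option<Self::Item> {
        let mut record = String::new();
        let mut line = String::new();

        loop {
            line.clear();

            match self.reader.read_line(&mut line) {
                Err(error) => return Some(Err(error)),
                // End of input: emit a trailing record only if it has content.
                Ok(0) => {
                    return if record.trim().is_empty() {
                        None
                    } else {
                        Some(Ok(record))
                    };
                }
                Ok(_) => {
                    // A line holding only "$$$$" terminates the record.
                    if line.trim_end() == "$$$$" {
                        return Some(Ok(record));
                    }

                    record.push_str(&line);
                }
            }
        }
    }
}
```

Wrapping a `std::fs::File` in a `std::io::BufReader` and passing it to `SdRecords::new` would keep memory use proportional to the largest single record rather than to the file size.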

Chemical datasets often need to be exchanged with high fidelity. A number of file formats enabling such exchange can be found in the wild, but the most common by far is the structure-data file (SDfile, aka “SD File,” “SD file,” or “SDF”). Although the format appears simple on the surface, there are some subtleties to consider. This article takes a closer look at the de facto standard for information exchange in cheminformatics.

Like most successful duos, Rust and WebAssembly (Wasm) complement each other. Rust is a type-safe systems language with modern tooling and high-level features. WebAssembly is a portable compilation target and execution environment for the web browser and beyond. The combination makes it possible to write fast, stable software that runs anywhere without recompilation. But there’s a catch.

Rust iterators are fundamental to the language and can be found in a variety of contexts. Consuming iterators returned from functions in the standard library and crates is straightforward. Eventually, however, you’ll want to return iterators from your own functions. This article discusses the major approaches to this surprisingly complex problem. It’s based in part on answers to this question.
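To make the problem concrete, here is a minimal sketch of three commonly used ways to return an iterator from a function. This summary does not say which approaches the article itself covers, so the selection is an assumption, and the names `evens`, `count`, and `Countdown` are hypothetical.

```rust
/// Approach 1: `impl Trait` in return position. The concrete iterator
/// type stays hidden from the caller and is resolved at compile time.
pub fn evens(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}

/// Approach 2: a boxed trait object. Useful when branches produce
/// different concrete iterator types, at the cost of a heap allocation
/// and dynamic dispatch.
pub fn count(limit: u32, descending: bool) -> Box<dyn Iterator<Item = u32>> {
    if descending {
        Box::new((0..limit).rev())
    } else {
        Box::new(0..limit)
    }
}

/// Approach 3: a named struct that implements `Iterator` directly.
/// More code, but the return type can be named and stored in fields.
pub struct Countdown {
    remaining: u32,
}

impl Countdown {
    pub fn new(start: u32) -> Self {
        Self { remaining: start }
    }
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}
```

The two branches of `count` have different types (`Rev<Range<u32>>` and `Range<u32>`), which is why a plain `impl Iterator` return type would not compile there without further wrapping.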

Python is the most popular orchestration language in scientific computing. Across a variety of fields, Python provides high-level interfaces to fast code written in other languages. A previous article introduced ChemCore, a new cheminformatics library written in Rust. This article moves the idea another step forward by introducing OxMol, Python bindings for ChemCore. The OxMol documentation describes two installation methods.

SMILES is the most widely used line notation in cheminformatics, and one of two standard information exchange formats. Like Molfile, SMILES supports hydrogen suppression, a method for representing monovalent hydrogens and associated bonds without explicitly encoding them within the molecular graph. But getting that system to work well requires clear, readily available documentation. This article attempts to solve that problem.
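As a small illustration of hydrogen suppression, the sketch below computes the number of implicit hydrogens on an unbracketed, non-aromatic SMILES atom from the sum of its bond orders. It follows the default-valence convention as commonly documented for the SMILES organic subset, which is an assumption here and not necessarily the rule set the article describes. The function name `implicit_hydrogens` is hypothetical, and aromatic atoms, charges, and bracket atoms are deliberately left out.

```rust
/// Hypothetical helper: implicit hydrogen count for an unbracketed,
/// non-aromatic atom, given the sum of its explicit bond orders.
/// Returns `None` for elements outside the organic subset.
pub fn implicit_hydrogens(symbol: &str, bond_order_sum: u8) -> Option<u8> {
    // Default valences, in ascending order (assumed convention).
    let valences: &[u8] = match symbol {
        "B" => &[3],
        "C" => &[4],
        "N" | "P" => &[3, 5],
        "O" => &[2],
        "S" => &[2, 4, 6],
        "F" | "Cl" | "Br" | "I" => &[1],
        _ => return None,
    };

    // Pick the lowest default valence that accommodates the explicit
    // bonds; the shortfall is filled with hydrogens. If the bonds exceed
    // every default valence, no hydrogens are implied.
    let count = valences
        .iter()
        .find(|&&valence| valence >= bond_order_sum)
        .map(|&valence| valence - bond_order_sum)
        .unwrap_or(0);

    Some(count)
}
```

Under these assumptions, the lone carbon in `C` (methane) has a bond order sum of 0 and receives four hydrogens, while the oxygen in `CO` (methanol) has a sum of 1 and receives one.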

SMILES is a widely used language for chemical structure exchange. As such, no cheminformatics toolkit today would be complete without a SMILES reader and writer. This article describes the design and initial implementation of Purr, a toolkit-agnostic library for working with SMILES in Rust. In its current form, Purr can parse most of the SMILES language. Previous articles from this blog may be helpful in understanding Purr’s purpose and design.

Hydrogen bears the distinction of being both the most common element in the universe and one of the most predictable elements on the periodic table. Outnumbering carbon in many organic molecules, hydrogen rarely participates in anything more exciting than hydrogen bonding or acid-base reactions. For the most part, the monovalent hydrogen atoms studding the average organic molecule are ignored.