Rogue Scholar

Published October 29, 2024 in chem-bla-ics

Open Science doesn’t make publishing easier. And that’s all for the better: our research efforts are complex, so why should publishing them be simple? Sure, I am not talking about reference formatting or moving the Methods section to the right location, or some silly statement that all authors agree with the manuscript when you are the only author. No, let’s talk about data. What should you publish? How, and when?

References

The ChEMBL database as linked open data

https://doi.org/10.1186/1758-2946-5-23

Published May 8, 2013 in Journal of Cheminformatics

Authors Egon L Willighagen, Andra Waagmeester, Ola Spjuth, Peter Ansell, Antony J Williams, Valery Tkachenko, Janna Hastings, Bin Chen, David J Wild

PSnpBind: a database of mutated binding site protein–ligand complexes constructed using a multithreaded virtual screening workflow

https://doi.org/10.1186/s13321-021-00573-5

Published February 28, 2022 in Journal of Cheminformatics

Authors Ammar Ammar, Rachel Cavill, Chris Evelo, Egon Willighagen

Abstract: A key concept in drug design is how natural variants, especially the ones occurring in the binding site of drug targets, affect the inter-individual drug response and efficacy by altering binding affinity. These effects have been studied on very limited and small datasets while, ideally, a large dataset of binding affinity changes due to binding site single-nucleotide polymorphisms (SNPs) is needed for evaluation. However, to the best of our knowledge, such a dataset does not exist. Thus, a reference dataset of ligands binding affinities to proteins with all their reported binding sites’ variants was constructed using a molecular docking approach. Having a large database of protein–ligand complexes covering a wide range of binding pocket mutations and a large small molecules’ landscape is of great importance for several types of studies. For example, developing machine learning algorithms to predict protein–ligand affinity or a SNP effect on it requires an extensive amount of data. In this work, we present PSnpBind: A large database of 0.6 million mutated binding site protein–ligand complexes constructed using a multithreaded virtual screening workflow. It provides a web interface to explore and visualize the protein–ligand complexes and a REST API to programmatically access the different aspects of the database contents. PSnpBind is open source and freely available at https://psnpbind.org.

pathway; biology

GPML files for Homo sapiens pathways

https://doi.org/10.5281/zenodo.13933046

Published October 10, 2024

Author WikiPathways

Monthly data release from WikiPathways.org

Cheminformatics and quantitative structure-activity relationships; Bioinformatics and computational biology not elsewhere classified

Metabolite BridgeDb ID Mapping Database (20240903)

https://doi.org/10.6084/m9.figshare.26931712.v1

Published January 1, 2024

Author Egon Willighagen

BridgeDb ID mapping database for metabolites, using HMDB 5.0 (Release of April 2024), ChEBI 236 (Release of August 2024), and Wikidata (3 September 2024) as data sources.

Organic Chemistry; FOS: Chemical sciences; Cheminformatics

ChemPedia as RDF

https://doi.org/10.6084/m9.figshare.681678

Published January 1, 2013

Author Egon Willighagen

ChemPedia was a project by Rich Apodaca for crowd sourcing names of chemical compounds. Users were able to contribute names and up- or downvote names. When the service was discontinued, Rich released the data under the CCZero license.

100708 Nanomaterials; FOS: Nanotechnology; 100713 Nanotoxicology, Health and Safety; 30304 Physical Chemistry of Materials; FOS: Chemical sciences

NanoWiki 5

https://doi.org/10.6084/m9.figshare.7075214.v1

Published January 1, 2018

Author Egon Willighagen

New release with more JRC nanomaterials annotated with ENM ontology terms, and data from a 2013 NanoQSAR study from Small by Lin et al. on 24 metal oxides.

Biochemistry; Pharmacology; 39999 Chemical Sciences not elsewhere classified; FOS: Chemical sciences; Sociology

MOESM1 of PubChemRDF: towards the semantic annotation of PubChem compound and substance databases

https://doi.org/10.6084/m9.figshare.c.3696370_d1.v1

Published January 1, 2015

Authors Gang Fu, Colin Batchelor, Michel Dumontier, Janna Hastings, Egon Willighagen, Evan Bolton

Additional file 1. The supporting information for the paper entitled: PubChemRDF: towards the semantic annotation of PubChem compound and substance databases.

Space Science; Molecular Biology; Physiology; FOS: Biological sciences; Pharmacology

MOESM1 of XMetDB: an open access database for xenobiotic metabolism

https://doi.org/10.6084/m9.figshare.c.3698536_d1.v1

Published January 1, 2016

Authors Ola Spjuth, Patrik Rydberg, Egon Willighagen, Chris Evelo, Nina Jeliazkova

Additional file 1. Example of exported SDF file from XMetDB.

Additional files, data, datasets, databases, and published data
