Biological Sciences, English, Blogger

Bio <-> Chem

Technical notes from the interface between bioinformatics and cheminformatics by Chris Southan
Author: Christopher Southan

A paper from back in the summer, "Deep learning-guided discovery of an antibiotic targeting Acinetobacter baumannii", has generated a lot of media interest (PMID 37231267, with the Guardian version below). Context for this post is given below in the brief repartee between the poster and yours truly. As a first-stop shop we find a range of

Since it was Rare Disease Day recently, I was reminded of something I shared with the ELIXIR Rare Diseases Community some years back that I can expand on. Very few folk were aware of this back then and it has zero social media surfacing (in my bubble at least), so I suspect it remains under-exploited due to not being FAIR (although the below does arguably just about fulfil "F").

This post is centered around a series of related graphics. Those with an eye for cultural references may recognise the first one as a lyric from "Once in a Lifetime". I will outline how I got to an h-index of 61 on Google Scholar and into the 1% citation club. The explanation for the post-2016 citation surge is given below.

Update Sept 2025: Welcome, however you got here, including if you clicked through from our NAR 2026 Database issue contribution (PMID to be added in due course). As part of the drafting, the following table has been generated from GtoPdb release 2025-2 (these numbers will increase slightly for the last release before the finalised paper).

This post investigates the glaring mismatch between the subsumption of 500,000 CAS substances into PubChem, as explicitly declared in PMID 35559614 (title below), and the actual indexing results we can interrogate from the interface. Within PubChem the 500K collapse to 492K as annotations. These undergo a further collapse to 367K as CIDs in the PubChem Table of Contents (TOC).
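
To put the collapse in proportion, here is a minimal sketch of the arithmetic. It uses only the rounded figures quoted above (500K, 492K, 367K), so the percentages are approximate; the function name `retention` and the stage labels are my own, chosen for illustration.

```python
# Rounded counts as quoted in the text; the exact values will differ slightly.
STAGES = [
    ("CAS substances declared", 500_000),
    ("PubChem annotations", 492_000),
    ("CIDs in PubChem TOC", 367_000),
]


def retention(stages):
    """Return (label, count, % of first stage, % of previous stage) per stage."""
    first = stages[0][1]
    previous = first
    rows = []
    for label, count in stages:
        rows.append((label, count, 100 * count / first, 100 * count / previous))
        previous = count
    return rows


if __name__ == "__main__":
    for label, count, pct_first, pct_prev in retention(STAGES):
        print(f"{label:26s} {count:>8,d}  {pct_first:5.1f}% of declared  "
              f"{pct_prev:5.1f}% of previous stage")
```

On these rounded numbers roughly a quarter of the declared substances do not surface as CIDs in the TOC.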

Update 07 Aug: In response to a recent Twitter query regarding property slice 'n dice, you can select NPA as a CID source below. And the much larger Lotus set of 215,384 below via the PubChem Table of Contents (then push to Entrez). The intersect is not too bad (i.e. 88% of NPA is subsumed by Lotus
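
For anyone who prefers to check such an intersect offline, a minimal sketch of the set arithmetic follows. It assumes you have already exported the two CID lists (for example from Entrez) into Python sets; the function name `overlap_stats` and the toy CIDs in the usage lines are hypothetical, not real NPA or Lotus content.

```python
def overlap_stats(source_a, source_b):
    """Compare two sets of CIDs and report the size of the intersect.

    Returns the shared count plus the percentage of each source that
    is covered by the other (0.0 for an empty source).
    """
    shared = source_a & source_b
    pct_a = 100 * len(shared) / len(source_a) if source_a else 0.0
    pct_b = 100 * len(shared) / len(source_b) if source_b else 0.0
    return {"shared": len(shared), "pct_of_a": pct_a, "pct_of_b": pct_b}


if __name__ == "__main__":
    # Toy CIDs for illustration only.
    smaller = {101, 102, 103, 104}
    larger = {102, 103, 104, 105, 106}
    print(overlap_stats(smaller, larger))
```

The "88% of NPA" figure corresponds to `pct_of_a` when the smaller source is passed first.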

American Association for Cancer Research (AACR) 2022 disclosures (Twitter hashtag #AACR22). So here we go again after the 2021 set. Data curation courtesy of Elena and yours truly (images below and first listing credit to DrugHunter). This was a bit more difficult than for ACS First Disclosures.

I am pleased to welcome guest blogger and collaborator Elena Faccenda (ORCID 0000-0001-9855-7103). We had already worked together on the same set from 2021, which was promptly curated into the Guide to Pharmacology. For this year's crop Elena had not only resolved all the codes to SMILES and/or PubChem IDs in short order but also curated the GtoPdb entries in time for Database Release 2022-1.

Update 2 April. Links for these compounds are now live in GtoPdb Database Release 2022-1. Note that activity data (as a prerequisite for curation) is included in these entries. This pinged my inbox last week so I decided to see what I could resolve and track for these useful DrugHunter compounds in PubChem.

Having just made the cover of Science (PMID 34726479) and being a strong candidate for drug of the year (21 or 22?), the SARS-CoV-2 antiviral M-protease inhibitor PF-07321332 needs no introduction. It does, however, present a very topical name-to-structure (n2s) example to track through various sources. I have included different profiles from which we can distill aspects of interest.
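
As one possible starting point for an n2s check, the sketch below builds a PubChem PUG REST name-to-CID lookup URL for a code name. It only constructs the URL and makes no network call; the helper name `pubchem_name_to_cid_url` is my own, and this is just one route among the sources a name can be tracked through, not necessarily the one used in the post.

```python
from urllib.parse import quote

# Base address of the PubChem PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_name_to_cid_url(name):
    """Build the PUG REST URL that asks PubChem for CIDs matching a name.

    The name is percent-encoded so that spaces and slashes in code
    names or synonyms do not break the URL path.
    """
    return f"{PUG_REST}/compound/name/{quote(name, safe='')}/cids/JSON"


if __name__ == "__main__":
    # Paste the printed URL into a browser, or fetch it with your HTTP client.
    print(pubchem_name_to_cid_url("PF-07321332"))
```

A name that returns no CID, or more than one, is exactly the kind of n2s ambiguity worth chasing through the other sources.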