Biology, English, Substack

Paired Ends

Bioinformatics, computational biology, and data science updates from the field. Occasional posts on programming.
Papers, Biology, English
Published
Author: Stephen Turner

This week’s recap highlights Evo2 for variant effect analysis and genome design, a preprint showing that pretraining doesn’t necessarily improve the performance of genomic foundation models, a new R package, ggalign, for making complex biological data visualizations with ggplot2, and an ancestral reconstruction method for ancient DNA. I also highlight a few reviews in biodiversity genomics.

AI, Biology, English
Published
Author: Stephen Turner

In a previous post I demonstrated how to set up a local LLM that you can run through either a command line interface (Ollama) or a graphical user interface (Open WebUI and others), and briefly showed how to “chat with your documents” with a local model using LMStudio. In that post I simply attached a few documents to a one-off chat.

Papers, Biology, English
Published
Author: Stephen Turner

This week’s recap highlights ESCARGOT, an AI agent for biomedical knowledge graphs and reasoning, CASTER for direct species tree inference from whole-genome alignments, the scGPT-spatial foundation model for spatial transcriptomics, the BioChatter platform for biomedical research applications with LLMs, moscot for mapping cells through time and space, and two reviews: one on epigenetic clocks and another on structural variation in the human

Python, Biology, English
Published
Author: Stephen Turner

This is part 2 of a series on uv. Other posts in this series: (1) uv, part 1: running scripts and tools; (2) this post; (3) coming soon… Last year I wrote a post on creating a Python command line application with Click using a cookiecutter template, building with setuptools and publishing with twine.

Papers, Biology, English
Published
Author: Stephen Turner

Here we are most of the way through March and I’m just getting around to my first “weekly” recap. It’s been a busy month — I’m writing a few papers of my own which I’ll share here when they’re published, and I took some much needed R&R in Portugal where I traded my stack of research papers for some escapist sci-fi. But I’m back now and making my way through a deep backlog.

TIL, AI, Biology, English
Published
Author: Stephen Turner

Anyone reading this newsletter has surely used the frontier models like ChatGPT, Claude, and Gemini. I’ve written a few posts about using local models but haven’t really talked much about the tools I use to directly interact with these models. Those previous posts interact with local models using tools like ellmer in R or my own biorecap package which interacts with a locally running Ollama server.

Python, Biology, English
Published
Author: Stephen Turner

This is part 1 of a series on uv. Other posts in this series: (1) this post; (2) uv, part 2: building and publishing packages; (3) coming soon… Lately I’ve heard a lot of great things about uv, an extremely fast Python package and project manager written in Rust. Once the volume of praise I was seeing for uv reached a critical level, I decided to take a look at the docs myself and give it a try.

Papers, Biology, English
Published
Author: Stephen Turner

This week’s recap highlights Verkko2 for T2T genome assembly, GPN-MSA, a DNA language model trained on multispecies alignments for variant effect prediction that outperforms other methods like CADD, ESM-1b, phyloP, phastCons, Nucleotide Transformer, and HyenaDNA, fast orthology inference with FastOMA, a foundation model of transcription across cell types, and engineering CRISPR-Cas PAM sites using deep learning.

Papers, Biology, English
Published
Author: Stephen Turner

It’s been a few weeks since I wrote a recap about what I’m reading. It’s been difficult watching helplessly as the institutions and financial infrastructure underpinning my profession are being systematically and irreversibly dismantled, with brilliant scientists I know personally having their careers destroyed and lives upturned.

R, TIL, Biology, English
Published
Author: Stephen Turner

Last year I wrote a post describing an R package I put together that fetches recent bioRxiv preprints from a given subject and summarizes them in a couple of sentences using a local LLM running through Ollama. That tool has a limitation: it uses the bioRxiv RSS feed to pull recent paper titles and abstracts, and the RSS feeds currently provide only the 30 most recent preprints in each subject area.