Biology, English, Substack

Paired Ends

Bioinformatics, computational biology, and data science updates from the field. Occasional posts on programming.
Papers, Biology, English
Author: Stephen Turner

This week’s recap highlights a new way to turn Nextflow pipelines into web apps, DRAGEN for fast and accurate variant calling, machine-guided design of cell-type-targeting cis-regulatory elements, a Nextflow pipeline for identifying and classifying protein kinases, a new language model for single cell perturbations that integrates knowledge from literature, GeneCards, etc., and a new method for scalable protein design in a relaxed sequence

Papers, Biology, English
Author: Stephen Turner

This week’s recap highlights the WorkflowHub registry for computational workflows, building a virtual cell with AI, a review on bioinformatics methods for prioritizing causal genetic variants in candidate regions, a benchmarking study showing deep learning methods are best for variant calling in bacterial nanopore sequencing, and a new ML model from researchers at Genentech for predicting cell-type- and condition-specific gene expression across

R, TIL, Python, Biology, English
Author: Stephen Turner

In the spirit of learning in public, today I learned about the .keep argument in dplyr. This doesn’t add anything you can’t do with a select or transmute, but might help simplify some of your dplyr pipelines. In the examples below I’m using a few rows from the built-in iris dataset to demonstrate how to use the .keep argument by creating a new ratio variable that’s the ratio of the sepal length to width.
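A minimal sketch of what such examples might look like, assuming the post is referring to the .keep argument of dplyr's mutate(). The object names (iris_small, keep_all, and so on) are hypothetical and chosen only for illustration.

```r
library(dplyr)

# A few rows from the built-in iris dataset (iris_small is a hypothetical name)
iris_small <- head(iris, 3)

# Default, .keep = "all": every original column plus the new ratio column
keep_all <- iris_small %>%
  mutate(ratio = Sepal.Length / Sepal.Width)

# .keep = "used": only the columns used to compute ratio, plus ratio
# columns: Sepal.Length, Sepal.Width, ratio
keep_used <- iris_small %>%
  mutate(ratio = Sepal.Length / Sepal.Width, .keep = "used")

# .keep = "unused": drops the columns used to compute ratio
# columns: Petal.Length, Petal.Width, Species, ratio
keep_unused <- iris_small %>%
  mutate(ratio = Sepal.Length / Sepal.Width, .keep = "unused")

# .keep = "none": only the newly created column (plus any grouping variables)
# columns: ratio
keep_none <- iris_small %>%
  mutate(ratio = Sepal.Length / Sepal.Width, .keep = "none")
```

The "used" and "none" variants are the ones that can stand in for a mutate followed by a select, or for a transmute.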

R, Nextflow, Python, Biology, English
Author: Stephen Turner

It’s a short week here in the US. As I reflect on the tools that shape modern bioinformatics and data science, it’s striking to see how far we’ve come in the 20 years I’ve been in this field. Today’s ecosystem is rich with tools that make our work faster, better, more enjoyable, and increasingly accessible.

Papers, Biology, English
Author: Stephen Turner

This week’s recap highlights pangenome graph construction with nf-core/pangenome, building pangenome graphs with PGGB, benchmarking algorithms for single-cell multi-omics prediction and integration, RNA foundation models, and a Nextflow pipeline for characterizing B cell receptor repertoires from non-targeted bulk RNA-seq data.

R, Biology, English
Author: Stephen Turner

This post is inspired by the Bluesky Network Analyzer made by @theo.io. I’m encouraging everyone I know online to join the scientific community on Bluesky. In that post I link to several starter packs — lists of accounts posting about a topic that you can follow individually or all at once to start filling out your network. I started following accounts of people I knew from X and from a few starter packs I came across.

R, Nextflow, Biology, English
Author: Stephen Turner

I joined Twitter way back in 2009. For nearly 10 years “scitwitter” was an amazing place for discussion, discovery, and engagement with the scientific community. The #Rstats and #pydata hashtags were great places to learn about something new in programming, #icanhazpdf was great for getting papers you didn’t have access to, and conference live-tweeting was common and useful for those of us with FOMO not able to make it in person.

Papers, Biology, English
Author: Stephen Turner

This week’s recap highlights an AI agent for automated multi-omic analysis (AutoBA), rapid species-level metagenome profiling and containment (sylph), a review on genome-wide association analysis beyond SNPs, private information leakage from scRNA-seq count matrices, and a method to “unlearn” viral knowledge in protein language models as a means to develop safe PLM-based variant effect analysis (PROEDIT). Others that caught my attention include

TIL, Python, Biology, English
Author: Stephen Turner

In the spirit of Learning in Public, I wanted an excuse to explore (1) click for creating command line interfaces, (2) Cookiecutter project templates, and (3) modern tools in the Python packaging ecosystem. If you’re primarily an R developer like me, I recently wrote about resources for getting better at Python for R users.