Biological Sciences · English · Substack

Paired Ends

Bioinformatics, computational biology, and data science updates from the field. Occasional posts on programming.
Papers · Biological Sciences · English
Author: Stephen Turner

This week’s recap highlights a new way to turn Nextflow pipelines into web apps, DRAGEN for fast and accurate variant calling, machine-guided design of cell-type-targeting cis-regulatory elements, a Nextflow pipeline for identifying and classifying protein kinases, a new language model for single cell perturbations that integrates knowledge from literature, GeneCards, etc., and a new method for scalable protein design in a relaxed sequence

Papers · Biological Sciences · English
Author: Stephen Turner

This week’s recap highlights the WorkflowHub registry for computational workflows, building a virtual cell with AI, a review on bioinformatics methods for prioritizing causal genetic variants in candidate regions, a benchmarking study showing deep learning methods are best for variant calling in bacterial nanopore sequencing, and a new ML model from researchers at Genentech for predicting cell-type- and condition-specific gene expression across

R · TIL · Python · Biological Sciences · English
Author: Stephen Turner

In the spirit of learning in public, today I learned about the .keep argument to dplyr’s mutate(). This doesn’t add anything you can’t do with a select or transmute, but it might help simplify some of your dplyr pipelines. In the examples below I’m using a few rows from the built-in iris dataset to demonstrate how to use the .keep argument by creating a new ratio variable that’s the ratio of the sepal length to width.
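
A minimal sketch of what that could look like (not the author's own code): it assumes dplyr 1.0.0 or later, takes the first three rows of iris, and computes the ratio column under each of the four documented .keep values. The object names (iris_small, all_cols, and so on) are illustrative only.

```r
library(dplyr)

# A few rows from the built-in iris dataset (illustrative subset)
iris_small <- head(iris, 3)

# .keep = "all" (the default): every original column plus the new one
all_cols <- iris_small %>%
  mutate(ratio = Sepal.Length / Sepal.Width, .keep = "all")

# .keep = "used": only the columns used to compute ratio, plus ratio
used_cols <- iris_small %>%
  mutate(ratio = Sepal.Length / Sepal.Width, .keep = "used")

# .keep = "unused": drops the columns used to compute ratio
unused_cols <- iris_small %>%
  mutate(ratio = Sepal.Length / Sepal.Width, .keep = "unused")

# .keep = "none": only the new column (plus grouping variables, if any)
none_cols <- iris_small %>%
  mutate(ratio = Sepal.Length / Sepal.Width, .keep = "none")

names(used_cols)    # "Sepal.Length" "Sepal.Width" "ratio"
names(unused_cols)  # "Petal.Length" "Petal.Width" "Species" "ratio"
names(none_cols)    # "ratio"
```

With .keep = "used" the pipeline does in one step what would otherwise need a mutate followed by a select of the input columns and the new one.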

R · Nextflow · Python · Biological Sciences · English
Author: Stephen Turner

It’s a short week here in the US. As I reflect on the tools that shape modern bioinformatics and data science, it’s striking to see how far we’ve come in the 20 years I’ve been in this field. Today’s ecosystem is rich with tools that make our work faster, better, more enjoyable, and increasingly accessible.

Papers · Biological Sciences · English
Author: Stephen Turner

This week’s recap highlights pangenome graph construction with nf-core/pangenome, building pangenome graphs with PGGB, benchmarking algorithms for single-cell multi-omics prediction and integration, RNA foundation models, and a Nextflow pipeline for characterizing B cell receptor repertoires from non-targeted bulk RNA-seq data.

R · Biological Sciences · English
Author: Stephen Turner

This post is inspired by the Bluesky Network Analyzer made by @theo.io. I’m encouraging everyone I know online to join the scientific community on Bluesky. In that post I link to several starter packs — lists of accounts posting about a topic that you can follow individually or all at once to start filling out your network. I started following accounts of people I knew from X and from a few starter packs I came across.

R · Nextflow · Biological Sciences · English
Author: Stephen Turner

I joined Twitter way back in 2009. For nearly 10 years “scitwitter” was an amazing place for discussion, discovery, and engagement with the scientific community. The #Rstats and #pydata hashtags were great places to learn about something new in programming, #icanhazpdf was great for getting papers you didn’t have access to, and conference live-tweeting was common and useful for those of us with FOMO not able to make it in person.

Papers · Biological Sciences · English
Author: Stephen Turner

This week’s recap highlights an AI agent for automated multi-omic analysis (AutoBA), rapid species-level metagenome profiling and containment (sylph), a review on genome-wide association analysis beyond SNPs, private information leakage from scRNA-seq count matrices, and a method to “unlearn” viral knowledge in protein language models as a means to develop safe PLM-based variant effect analysis (PROEDIT). Others that caught my attention include

TIL · Python · Biological Sciences · English
Author: Stephen Turner

In the spirit of Learning in Public, I wanted an excuse to explore (1) click for creating command line interfaces, (2) Cookiecutter project templates, and (3) modern tools in the Python packaging ecosystem. If you’re primarily an R developer like me, I recently wrote about resources for getting better at Python for R users.