Biological Sciences, English, Substack

Paired Ends

Bioinformatics, computational biology, and data science updates from the field. Occasional posts on programming.
R, AI, Biological Sciences, English
Published
Author: Stephen Turner

I have a little hobby project I’m working on and I wanted to use the opportunity to fully make the switch to Positron from RStudio. I used Positron here and there when it first came out, but now that it’s out of beta and has a more complete feature set (like remote SSH sessions!) I have everything I need to switch and not look back.

R, AI, Biological Sciences, English
Published
Author: Stephen Turner

Note: After I wrote this post last week, the Tidyverse team released ragnar 0.2.0 on July 12. Everything here should still work, but take a look at the release notes to learn about some nice new features that aren’t covered here. I’ve written a little about retrieval-augmented generation (RAG) here before.

Papers, Biological Sciences, English
Published
Author: Stephen Turner

This week’s recap highlights an interesting new model of deep ancestral structure shared by humans unearthed using a new coalescent-based HMM (cobraa), a genomic language model for predicting enhancers and their allele-specific activity, atom-level enzyme active site scaffolding using RFdiffusion2, and a new perspective article on multimodal foundation models in biology.

R, Biological Sciences, English
Published
Author: Stephen Turner

karyoploteR is an R package that’s been in Bioconductor for nearly a decade. It lets you create linear chromosomal representations of any genome with genomic annotations and experimental data plotted along them. Bioconductor: https://bioconductor.org/packages/karyoploteR/ Tutorial: https://bernatgel.github.io/karyoploter_tutorial/ Paper: Bernat Gel & Eduard Serra. (2017).

Papers, Biological Sciences, English
Published
Author: Stephen Turner

This week’s recap highlights the Rust-based wgatools for manipulating alignments and visualizing in the terminal, the nf-core scnanoseq Nextflow pipeline for ONT scRNA-seq, sawfish for better SV discovery and genotyping with long reads, the BINSEQ high-performance binary formats for nucleotide sequence data, and a unified analysis of atlas single-cell data.

Biological Sciences, English
Published
Author: Stephen Turner

I originally wrote and published this essay at The Connected Ideas Project, an excellent newsletter by my good friend and colleague Alexander Titus. If you’re not reading TCIP you’re missing out. After I finished my postdoc I was faculty in academia for eight years before moving to a consulting firm for five years, then joined a biotech startup two years ago.

Papers, Biological Sciences, English
Published
Author: Stephen Turner

This week’s recap highlights the new Datavzrd tool for interactive visualization and communication of tabular data (I’m genuinely really looking forward to trying this one), tracing the shared foundations of gene expression and chromatin structure, PISA for visualizing cis-regulatory rules in genomic data, fast protein structure searching using structure graph embeddings, and a review/perspective on intrinsically disordered regions as

Python, Biological Sciences, English
Published
Author: Stephen Turner

This is part 4 of a series on uv. Other posts in this series: uv, part 1: running scripts and tools; uv, part 2: building and publishing packages; uv, part 3: Python in R with reticulate. I’ve never been a big fan of notebooks, and I’m not the only one. Out of order code execution, hidden state, difficulty diffing in version control, output bloat, etc.

AI, Biological Sciences, English
Published
Author: Stephen Turner

One of my previous employers was a Google Cloud partner, which gave me full and free access to all of Google Cloud’s certification programs, where I took the Professional Cloud Architect and Professional Data Engineer programs. It shouldn’t surprise anyone that, with Google leaning hard into GenAI, they have new certification programs and learning paths, like this Generative AI Leader certification.

Papers, Biological Sciences, English
Published
Author: Stephen Turner

This week’s recap highlights PhyloSketch for interactively drawing and manipulating phylogenies, Uncalled4 for nanopore DNA and RNA modification detection, Severus for SV calling from long reads, CREsted for modeling synthetic cell type-specific enhancers, and a review on transformers and genome language models.

R, AI, Biological Sciences, English
Published
Author: Stephen Turner

There was a time in late 2023 to early 2024 when I and probably many others in the R community felt like R was falling woefully behind Python in tooling for development using AI and LLMs. This is no longer the case. The R community, and Posit in particular, have been on an absolute tear bringing new packages online to take advantage of all the capabilities that LLMs provide.