Biology, English, Substack

Paired Ends

Bioinformatics, computational biology, and data science updates from the field. Occasional posts on programming.
Biology, English

I covered Autocycler (paper, code, docs) in last week’s recap. I wanted to try this tool out myself. I followed the demo dataset described in the Autocycler docs, which contains ONT reads from a few E. coli plasmids, and mostly used the same code provided in the docs to run Autocycler on this data.

Papers, Biology, English
Author: Stephen Turner

This week’s recap highlights Autocycler for long-read consensus assembly for bacterial genomes (future post on this one alone coming soon), Progen3 for broader generation and deeper functional understanding of proteins, the CarpeDeam de novo metagenome assembler for ancient datasets, and hifiasm ONT for efficient T2T assembly of Nanopore Simplex reads.

Papers, Biology, English

I’m thrilled to share our new paper, published today in Nature Reviews Biodiversity. You can read the paper (free) here: https://rdcu.be/ewG5R. This Perspective paper was a global collaboration between Colossal Biosciences, the University of East Anglia, the Globe Institute at the University of Copenhagen, the Mauritian Wildlife Foundation, Durrell Wildlife Conservation Trust, the government of

R, AI, Biology, English

I have a little hobby project I’m working on and I wanted to use the opportunity to fully make the switch to Positron from RStudio. I used Positron here and there when it first came out, but now that it’s out of beta and has a more complete feature set (like remote SSH sessions!) I have everything I need to switch and not look back.

R, AI, Biology, English

Note: After I wrote this post last week, the Tidyverse team released ragnar 0.2.0 on July 12. Everything here should still work, but take a look at the release notes to learn about some nice new features that aren’t covered here. I’ve written a little about retrieval-augmented generation (RAG) here before.

Papers, Biology, English
Author: Stephen Turner

This week’s recap highlights an interesting new model of deep ancestral structure shared by all humans, uncovered with a new coalescent-based HMM (cobraa); a genomic language model for predicting enhancers and their allele-specific activity; atom-level enzyme active site scaffolding using RFdiffusion2; and a new perspective article on multimodal foundation models in biology.

R, Biology, English

karyoploteR is an R package that’s been in Bioconductor for nearly a decade. It lets you create linear chromosomal representations of any genome with genomic annotations and experimental data plotted along them. Bioconductor: https://bioconductor.org/packages/karyoploteR/ Tutorial: https://bernatgel.github.io/karyoploter_tutorial/ Paper: Bernat Gel & Eduard Serra. (2017).

Papers, Biology, English
Author: Stephen Turner

This week’s recap highlights the Rust-based wgatools for manipulating alignments and visualizing in the terminal, the nf-core scnanoseq Nextflow pipeline for ONT scRNA-seq, sawfish for better SV discovery and genotyping with long reads, the BINSEQ high-performance binary formats for nucleotide sequence data, and a unified analysis of atlas single-cell data.

Biology, English

I originally wrote and published this essay at The Connected Ideas Project, an excellent newsletter by my good friend and colleague Alexander Titus. If you’re not reading TCIP you’re missing out. After I finished my postdoc I was faculty in academia for eight years before moving to a consulting firm for five years, then joined a biotech startup two years ago.

Papers, Biology, English

This week’s recap highlights the new Datavzrd tool for interactive visualization and communication of tabular data (I’m genuinely really looking forward to trying this one), tracing the shared foundations of gene expression and chromatin structure, PISA for visualizing cis-regulatory rules in genomic data, fast protein structure searching using structure graph embeddings, and a review/perspective on intrinsically disordered regions as