Biological Sciences, English, Substack

Paired Ends

Bioinformatics, computational biology, and data science updates from the field. Occasional posts on programming.
R, AI, Biological Sciences, English
Published
Author: Stephen Turner

Last month I published a paper and an R package for summarizing preprints from bioRxiv using a local LLM, and I wrote about it in an earlier post. Llama 3.2 was just released today (announcement). The biggest news is the addition of a multimodal vision model, but I was intrigued by the reasonably good performance of the tiny 3B text model. I used this as an excuse to update the biorecap R package.

TIL, Biological Sciences, English
Published
Author: Stephen Turner

Cite: Stephen Turner. “Learning in Public.” Paired Ends (2024). DOI: https://doi.org/10.59350/xwgsf-nj906. I wrote my first public blog post in 2009. I started Getting Genetics Done to share what I was learning at the end of my PhD/postdoc through my first few years as faculty.

TIL, Biological Sciences, English
Published
Author: Stephen Turner

I recently stumbled across Phil Ewels’s ~18-minute nf-core/bytesize talk on Excalidraw. For years I’ve been using draw.io for making flowcharts and diagrams for documentation, papers, presentations, and for general brainstorming and communication with my team, clients, and collaborators. Excalidraw (excalidraw.com) looks like an attractive alternative.

Papers, Biological Sciences, English
Published
Author: Stephen Turner

This week’s recap highlights a new nf-core workflow for multi-omics trait association studies, a new tool for linking genotype to phenotype (G2P) by directly sequencing alleles from CRISPR base editing experiments, the SplitsTree app for interactive analysis and visualization using phylogenetic trees and networks, mapping cellular interactions from spatially resolved transcriptomics data, a study of marine microbial diversity and bioprospecting

TIL, AI, Biological Sciences, English
Published
Author: Stephen Turner

Google has a new experimental tool called Illuminate (illuminate.google.com) that takes a link to a preprint and creates a podcast discussing the paper. When I tested this with a few preprints, the podcasts it generated were about 6-8 minutes long, featuring a male and a female voice discussing the key points of the paper in a conversational style. There are some obvious shortcomings.

Papers, Biological Sciences, English
Published
Author: Stephen Turner

This week’s recap highlights a new tool from Wei Shen and Zamin Iqbal for efficient sequence alignment against millions of prokaryotic genomes (LexicMap), a new tool from Heng Li for efficiently constructing and querying a sequence index at scale, an R/Bioconductor package for detecting and correcting DNA contamination in RNA-seq data, a method for dating gene age using synteny, how AlphaFold predictions for some types of conformations are

Papers, Biological Sciences, English
Published
Author: Stephen Turner

This week’s recap highlights Google DeepMind’s new AlphaProteo tool for protein design, tools for protein structure alignment and analysis, biases in polygenic risk scores due to overlap and kinship, highly variable gene selection in single-cell RNA-seq, and reconstruction of a 4.2-billion-year-old last universal common ancestor of life on Earth (spoiler alert: CRISPR-Cas is >4B years old!). Others that caught my attention include a new

R, Biological Sciences, English
Published
Author: Stephen Turner

VP (Pete) Nagraj is a longtime friend, colleague, and collaborator, and is the author of this post. Pete and I have co-authored over a dozen publications and have taught several graduate courses in data science together. Pete leads the health security analytics / infectious disease modeling and forecasting group at Signature Science, where we started this work together last year.

Papers, Biological Sciences, English
Published
Author: Stephen Turner

This week’s recap highlights perspectives on de-extinction and patent law, systematic benchmarking of scATAC-seq methods, a 91 gigabase (!) animal genome, structural variant genotyping with long reads, cell type-specific enhancer prediction, and a perspective piece on AI in biosecurity.