
TL;DR: Codeium offers a free Copilot-like experience in Positron. You can install it from the Open VSX registry directly within the extensions pane in Positron.
TL;DR: Codeium offers a free Copilot-like experience in Positron. You can install it from the Open VSX registry directly within the extensions pane in Positron.
This week’s recap highlights a Nextflow pipeline for eQTL detection, an end-to-end pipeline for spatial transcriptomics (visium) data analysis, a method for identification of perturbed cell types in single cell RNA-seq data, a method for guide assignment in single-cell CRISPR screens, a tool for on-target/off-target analysis of gene editing outcomes, and “digital microbes” for collaborative team science on emerging microbes.
Last month I published a paper and an R package for summarizing preprints from bioRxiv using a local LLM. I wrote about it here: Llama 3.2 was just released today (announcement). The biggest news is the addition of a multimodal vision model, but I was intrigued by the reasonably good performance of the tiny 3B text model. I used this as an excuse to update the biorecap R package.
I wrote my first public blog post in 2009. I started Getting Genetics Done to share what I was learning at the end of my PhD/postdoc through my first few years as faculty. Some of the earliest posts were simple, such as how to write and run a simple Perl script, to bigger topics like why it’s usually a bad idea to categorize continuous variables in a linear model.
I recently stumbled across Phil Ewels’s ~18 minute nf-core/bytesize talk on Excalidraw: For years I’ve been using draw.io for making flowcharts and diagrams for documentation, papers, presentations, and for general brainstorming and communication with my team, clients, and collaborators.1 Excalidraw (excalidraw.com) looks like an attractive alternative.
This week’s recap highlights a new nf-core workflow for multi-omics trait association studies, a new tool for linking genotype to phenotype (G2P) by directly sequencing alleles from CRISPR base editing experiments, the SplitsTree app for interactive analysis and visualization using phylogenetic trees and networks, mapping cellular interactions from spatially resolved transcriptomics data, a study of marine microbial diversity and bioprospecting
Google has a new experimental1 tool called Illuminate ( illuminate.google.com ) that takes a link to a preprint2 and creates a podcast discussing the paper. When I tested this with a few preprints, the podcasts it generated are about 6-8 minutes long, featuring a male and female voice discussing the key points of the paper in a conversational style. There are some obvious shortcomings.
This week’s recap highlights a new tool from Wei Shen and Zamin Iqbal for efficient sequence alignment against millions of prokaryotic genomes (LexicMap), a new tool from Heng Li for efficiently constructing and querying a sequence index at scale, an R/Bioconductor package for detecting and correcting DNA contamination in RNA-seq data, a method for dating gene age using synteny, how AlphaFold predictions for some types of conformations are
Llama 3.1 405B is the first open-source LLM on par with frontier models GPT-4o and Claude 3.5 Sonnet. I’ve been running the 70B model locally for a while now using Ollama + Open WebUI, but you’re not going to run the 405B model on your MacBook.
This is an experiment... ignore this post
This week’s recap highlights Google/Deepmind’s new AlphaProteo tool for protein design, tools for protein structure alignment and analysis, biases in polygenic risk scores due to overlap and kinship, highly variable gene selection in single cell RNA-seq, and reconstruction of a 4.2 billion year old last universal common ancestor of life on Earth (spoiler alert: CRISPR-Cas is >4B years old!). Others that caught my attention include a new