Rogue Scholar

Papers, Biological Sciences

Weekly Recap (Jan 2025 part 2)

Published January 10, 2025

Author Stephen Turner

I'm still catching up on papers from my late 2024 backlog. This week’s recap highlights autonomous microbial sensors for detecting TNT in soil, genome size estimation from long reads, STABIX for indexing and compressing GWAS summary statistics, and Clair3-RNA for deep learning-based small variant calling on long-read RNA-seq data.

AI, Biological Sciences

Gene Info Custom GPT

https://doi.org/10.59350/a48aw-1sq69

Published January 6, 2025

Author Stephen Turner

OpenAI introduced the ability to create custom GPTs back in November 2023. I wanted to try to create one of these, and in the spirit of learning in public this post describes how I made it. But first, what does it do? The Gene Info custom GPT takes a list of human gene symbols as input.

Papers, Biological Sciences

Weekly Recap (Jan 2025, part 1)

https://doi.org/10.59350/2zjt7-tqb76

Published January 3, 2025

Author Stephen Turner

Happy New Year! I’m still catching up on papers from my late 2024 backlog.

R, AI, Biological Sciences

Bluesky conversation analysis with local and frontier LLMs with R/Tidyverse

https://doi.org/10.59350/rzc7w-qkb06

Published December 30, 2024

Author Stephen Turner

Background: Bluesky, atrrr, local LLMs. I’ve written a few posts lately about Bluesky: first, Bluesky for Science, about Bluesky as a home for Science Twitter expats after the mass eXodus, and another on using the atrrr package to expand your Bluesky network. I’ve also spent some time looking at R packages that provide an interface to Ollama.

AI, Biological Sciences

The Enlightenment Conservatory

https://doi.org/10.59350/94aaz-m9h32

Published December 23, 2024

Author Stephen Turner

I had good intentions to give NaNoWriMo a try this year but didn’t get very far. Instead I gave OpenAI’s Creative Writing Coach GPT a try for a (very) short story I had in mind, inspired by my frustration trying to access closed-access research articles for a review article I’m preparing.

Biological Sciences

What I'm reading: de-extinction edition

https://doi.org/10.59350/892gr-q1e17

Published December 21, 2024

Author Stephen Turner

The Baader–Meinhof phenomenon (aka the frequency illusion) is the name for that thing that happens when you buy a new car, and suddenly you notice that same model car everywhere you drive.

Papers, Biological Sciences

Weekly Recap (Dec 2024, part 3)

https://doi.org/10.59350/p4rme-k4119

Published December 20, 2024

Author Stephen Turner

This week’s recap highlights the Evo model for sequence modeling and design, biomedical discovery with AI agents, improving bioinformatics software quality through teamwork, a new tool from Brent Pedersen and Aaron Quinlan (vcfexpress) for filtering and formatting VCFs with Lua expressions, a new paper about the NHGRI-EBI GWAS Catalog, and a review paper on designing and engineering synthetic genomes.

AI, Biological Sciences

Video to audio to transcript to summary using local AI: whisperfile and llama3.3

https://doi.org/10.59350/bjvsq-cqg11

Published December 18, 2024

Author Stephen Turner

A few days ago I wrote about translating R package help documentation using a local LLM (e.g. llama3.x)… …when Mick Watson commented: I was already thinking of wiring up something like this using local AI models — something to summarize podcasts, conference recordings, etc. The relatively new (as of this writing) Gemini 2.0 Flash model will do this for you for YouTube videos. But what if you wanted to do this offline using a local LLM?

TIL, AI, Biological Sciences

Turn any webpage into markdown for LLM-friendly input

https://doi.org/10.59350/g0y96-dwq81

Published December 16, 2024

Author Stephen Turner

Last week I posted about a web app that turns a GitHub repo into a single text file for LLM-friendly input. This is great for capturing LLM-friendly text from a GitHub repo, but what about any other arbitrary website or PDF? I was catching up on Simon Willison’s newsletter and read about an app he made with Claude artifacts that uses the Jina Reader API to generate Markdown from a website. You don’t need to use the API to do this.

R, AI, Biological Sciences

Use an LLM to translate help documentation on-the-fly

https://doi.org/10.59350/e574q-9pe41

Published December 14, 2024

Author Stephen Turner

Using LLMs in R: Most of the developer tooling for AI/LLM training and evaluation is Python-centric, but over just the past few months we’ve seen a surge of new tooling for AI/LLM applications in the R ecosystem. ollamar and rollama provide wrappers around the Ollama API, allowing you to run LLMs locally on your machine.

Papers, Biological Sciences

Weekly Recap (Dec 2024, part 2)

https://doi.org/10.59350/dwjza-ajz10

Published December 13, 2024

Author Stephen Turner

This week’s recap highlights a new way to turn Nextflow pipelines into web apps, DRAGEN for fast and accurate variant calling, machine-guided design of cell-type-targeting cis-regulatory elements, a Nextflow pipeline for identifying and classifying protein kinases, a new language model for single cell perturbations that integrates knowledge from literature, GeneCards, etc., and a new method for scalable protein design in a relaxed sequence

Paired Ends
