Getting Genetics Done

Getting Things Done in Genetics & Bioinformatics Research
R, Biology, English
Author: Stephen Turner

This is reposted from the original at https://blog.stephenturner.us/p/python-for-r-users. A Google search for “R vs Python” returns thousands of hits across sites like Reddit, IBM, Datacamp, Coursera, Kaggle, and many others.

R, Biology, English
Author: Stephen Turner

This is reposted from the original at https://blog.stephenturner.us/p/use-nanoparquet-instead-of-readr-csv. Parquet is interoperable between Python and R, fast to read and write, works well with databases, and stores complex data types (e.g., tibble list-columns). Use it instead of CSV. Many pros, few (no?) cons. Yesterday I wrote about base R vs. dplyr vs. duckdb for a simple summary analysis.

R, Biology, English
Author: Stephen Turner

Reposted from https://blog.stephenturner.us/p/duckdb-vs-dplyr-vs-base-r. TL;DR: For a very simple analysis (means by group on 100M rows), duckdb was 125x faster than base R and 28x faster than readr+dplyr, without having to read data from disk into memory. The duckplyr package wraps DuckDB's analytical query processing techniques in a dplyr-compatible API.

R, Biology, English
Author: Stephen Turner

This is reposted from my newsletter, where I'll be posting from now on: https://blog.stephenturner.us/p/biorecap-r-package-for-summarizing-biorxiv-preprints-local-llm. TL;DR: I wrote an R package that summarizes recent bioRxiv preprints using a locally running LLM via Ollama+ollamar, and produces a summary HTML report from a parameterized RMarkdown template.

Announcements, R, Recommended Reading, Biology, English
Author: Stephen Turner

My new blog/newsletter ("Paired Ends") is now at blog.stephenturner.us. I'll be posting semi-regular updates and literature highlights in bioinformatics, computational biology, and data science, along with the occasional post on programming.

Conferences, R, Biology, English
Author: Stephen Turner

The first-ever RStudio conference was held January 11-14, 2017 in Orlando, FL. For anyone else like me who spends hours each working day staring into an RStudio session, the conference was truly excellent. The speaker lineup was diverse and covered lots of areas related to development in R, including the tidyverse, the RStudio IDE, Shiny, htmlwidgets, and authoring with RMarkdown.

Recommended Reading, Biology, English
Author: Stephen Turner

I recently stumbled across this collection of computational biology primers in Nature Biotechnology. Many of these are old, but they're still great resources to get a fundamental understanding of the topic. Here they are in no particular order.