Getting Genetics Done

Getting Things Done in Genetics & Bioinformatics Research
Recommended Reading, Biology, English
Published
Author: Stephen Turner

I recently stumbled across this collection of computational biology primers in Nature Biotechnology. Many of these are old, but they're still great resources to get a fundamental understanding of the topic. Here they are in no particular order.

Biology, English
Published
Author: Stephen Turner

I came across this awesome gist explaining how to syntax highlight code in Keynote. The same trick works for PowerPoint. Mac only. Install Homebrew if you don’t have it already and run brew install highlight. Then run highlight -O rtf myfile.ext | pbcopy to convert the code to syntax-highlighted text in RTF format and copy the result to the system clipboard. Paste into Keynote or PowerPoint.

R, Web Apps, Biology, English
Published
Author: Stephen Turner

How many reads do I need? What's my sequencing depth? I get these questions all the time. Calculating how much sequence data you need to hit a target depth of coverage, or the inverse, the depth of coverage you get from a set amount of sequencing, takes only some basic algebra. Given one or the other, plus the genome size and read length/configuration, you can calculate the missing value.
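
The algebra can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the app: it assumes the usual average-coverage relation (depth = total sequenced bases / genome size), and the function names, the `paired` flag, and the 3 Gb genome size in the usage lines are hypothetical choices made for the example.

```python
import math


def coverage_depth(n_reads, read_length, genome_size, paired=False):
    """Expected average depth: total sequenced bases divided by genome size.

    For paired-end data, n_reads is the number of read pairs (assumption).
    """
    reads_per_fragment = 2 if paired else 1
    total_bases = n_reads * reads_per_fragment * read_length
    return total_bases / genome_size


def reads_needed(target_depth, read_length, genome_size, paired=False):
    """Reads (or read pairs, if paired) needed to reach target_depth."""
    reads_per_fragment = 2 if paired else 1
    bases_needed = target_depth * genome_size
    return math.ceil(bases_needed / (reads_per_fragment * read_length))


# Illustrative numbers only: a 3 Gb genome sequenced with 2x150 bp reads.
pairs = reads_needed(30, 150, 3_000_000_000, paired=True)
depth = coverage_depth(pairs, 150, 3_000_000_000, paired=True)
print(pairs, depth)
```

Both functions are the same equation solved for a different unknown, which is why either quantity can be recovered from the other once genome size and read configuration are fixed.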

Conferences, R, Biology, English
Published
Author: Stephen Turner

This is a guest post from VP Nagraj, a data scientist embedded within UVA’s Health Sciences Library, who runs our Data Analysis Support Hub (DASH) service. Last weekend I was fortunate enough to be able to participate in the first ever Shiny Developer Conference hosted by RStudio at Stanford University. I’ve built a handful of apps, and have taught an introductory workshop on Shiny.

ggplot2, R, Visualization, Biology, English
Published
Author: Stephen Turner

A while back I showed you how to make volcano plots in base R for visualizing gene expression results. This is just one of many genome-scale plots where you might want to show all individual results but highlight or call out important results by labeling them, for example, with a gene name. But if you want to annotate lots of points, the annotations usually get so crowded that they overlap one another and become illegible.

R, Web Apps, Biology, English
Published
Author: Stephen Turner

This is a guest post from VP Nagraj, a data scientist embedded within UVA’s Health Sciences Library, who runs our Data Analysis Support Hub (DASH) service. The What: GRUPO (Gauging Research University Publication Output) is a Shiny app that provides side-by-side benchmarking of American research university publication activity.

Bioinformatics, Databases, dplyr, R, Biology, English
Published
Author: Stephen Turner

I work with gene lists on a nearly daily basis: lists of genes near ChIP-seq peaks, lists of genes closest to a GWAS hit, lists of differentially expressed genes or transcripts from an RNA-seq experiment, lists of genes involved in certain pathways, etc. And I often need to convert these gene IDs from one identifier to another. There’s no shortage of tools to do this. I use Ensembl Biomart.

Bioinformatics, Conferences, Metagenomics, R, RNA-Seq, Biology, English
Published
Author: Stephen Turner

I just returned from the Genome Informatics meeting at Cold Spring Harbor. This was, hands down, the best scientific conference I've been to in years. The quality of the talks and posters was excellent, and it was great meeting in person many of the scientists and developers whose tools and software I use on a daily basis.

R, Tutorials, Biology, English
Published
Author: Stephen Turner

The problem: I was looking for a way to compile an RMarkdown document and have the filename of the resulting PDF or HTML document contain the name of the input data that it processed. That is, if I compiled analysis.Rmd, and that file did some analysis and reporting on data001.txt, I’d want the resulting filename to look something like data001.txt.analysis.html.

R, Tutorials, Visualization, Biology, English
Published
Author: Stephen Turner

I forget where I originally found the code to do this, but I recently had to dig it out again to remind myself how to draw two different y axes on the same plot to show the values of two different features of the data. This is somewhat distinct from the typical use case of aesthetic mappings in ggplot2, where I want different lines/points/colors/etc. for the same feature across multiple subsets of data.