Rogue Scholar

GWAS, R, Visualization, Biological Sciences

A Mitochondrial Manhattan Plot

Published November 6, 2013

Manhattan plots have become the standard way to visualize results for genetic association studies, allowing the viewer to instantly see significant results in the rough context of their genomic position. Manhattan plots are typically shown on a linear X-axis (although the circos package can be used for radial plots), and this is consistent with the linear representation of the genome in online genome browsers.
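The linear X-axis described above is usually built by laying the chromosomes end to end and plotting -log10(p) on the Y-axis. The following is a minimal sketch of that coordinate transform, not the code from the post; the function name, the toy chromosome lengths, and the toy p-values are all made up for illustration.

```python
import math


def manhattan_coordinates(results, chrom_lengths):
    """Map (chrom, pos, p) records to (x, y) points for a linear Manhattan plot.

    x is the position shifted by the combined length of all preceding
    chromosomes; y is -log10(p), so smaller p-values plot higher.
    """
    offsets = {}
    running_total = 0
    for chrom, length in chrom_lengths:
        offsets[chrom] = running_total
        running_total += length
    return [(offsets[chrom] + pos, -math.log10(p)) for chrom, pos, p in results]


# Hypothetical toy data: three "chromosomes" and one association result on each.
chrom_lengths = [("1", 1000), ("2", 800), ("MT", 100)]
results = [("1", 10, 0.01), ("2", 5, 1e-8), ("MT", 50, 0.001)]
points = manhattan_coordinates(results, chrom_lengths)
```

The resulting points can be passed to any plotting library as a scatter plot, typically colored by chromosome.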

Bioinformatics, Conferences, R, Visualization, Biological Sciences

Archival and analysis of #GI2013 Tweets

https://doi.org/10.59350/11jz7-py935

Published November 4, 2013

Author Stephen Turner

I archived and analyzed all Tweets containing #GI2013 from the recent Cold Spring Harbor Genome Informatics meeting, using my previously described code. Friday was the most Tweeted day.
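A finding such as "the most Tweeted day" comes down to counting archived tweets per day. The sketch below shows that counting step in Python. It is not the author's archiving code: the timestamp format and the example timestamps are assumptions invented for illustration.

```python
from collections import Counter
from datetime import datetime

# Fixed English names so the result does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def tweets_per_weekday(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Count tweets per weekday from timestamp strings (format is assumed)."""
    return Counter(
        DAY_NAMES[datetime.strptime(ts, fmt).weekday()] for ts in timestamps
    )


# Made-up timestamps, not real data from the archive.
example_timestamps = [
    "2013-10-31 09:15:00",
    "2013-11-01 10:00:00",
    "2013-11-01 16:30:00",
    "2013-11-02 11:45:00",
]
counts = tweets_per_weekday(example_timestamps)
busiest_day, busiest_count = counts.most_common(1)[0]
```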

Bioinformatics, R, RNA-Seq, Biological Sciences

Real-time streaming differential RNA-seq analysis with eXpress

https://doi.org/10.59350/ma6p5-qvh42

Published October 31, 2013

Author Stephen Turner

RNA-seq has been performed routinely for at least five years, yet there is no consensus on the best methodology for analyzing this data.

Conferences, Git, R, Twitter, Visualization, Biological Sciences

Analysis of #ASHG2013 Tweets

https://doi.org/10.59350/gmvz8-e7155

Published October 28, 2013

Author Stephen Turner

I archived and analyzed all Tweets with the hashtag #ASHG2013 using my previously mentioned code.

PubMed, Biological Sciences

PubMed Commons: One post-publication peer review forum to rule them all?

https://doi.org/10.59350/yr9rv-5rb57

Published October 22, 2013

Author Stephen Turner

Several post-publication peer review forums already exist, such as Faculty of 1000 or PubPeer, that facilitate discussion of papers after they have already been published. F1000 only allows a small number of "faculty" to comment on articles, and access to read commentary requires a paid subscription. PubPeer and similar startup services lack a critical mass of participants to make such a community truly useful.

Git, Github, Linux, Biological Sciences

Useful Unix/Linux One-Liners for Bioinformatics

https://doi.org/10.59350/6vkcn-04941

Published October 21, 2013

Author Stephen Turner

Much of the work that bioinformaticians do is munging and wrangling massive amounts of text. While there are some "standardized" file formats (FASTQ, SAM, VCF, etc.) and some tools for manipulating them (fastx toolkit, samtools, vcftools, etc.), there are still times when knowing a little bit of Unix/Linux is extremely helpful, namely awk, sed, cut, grep, GNU parallel, and others.
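To give a flavor of this kind of text munging, here is a small sketch that counts reads and bases in a FASTQ stream. It is written in Python rather than as an awk or sed one-liner, and it is not taken from the post. It assumes the common four-lines-per-record FASTQ layout (no wrapped sequence lines); the function name and the two example reads are invented.

```python
import io


def fastq_stats(handle):
    """Return (number of reads, total bases) for a 4-line-per-record FASTQ stream."""
    n_reads = 0
    total_bases = 0
    for i, line in enumerate(handle):
        # In the assumed layout, the second line of every 4-line record is the sequence.
        if i % 4 == 1:
            n_reads += 1
            total_bases += len(line.strip())
    return n_reads, total_bases


# Invented two-read example; a real file would be opened with open(path).
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGTAC\n+\nIIIIII\n")
n_reads, total_bases = fastq_stats(example)
```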

Bioinformatics, Recommended Reading, RNA-Seq, Tutorials, Biological Sciences

De Novo Transcriptome Assembly with Trinity: Protocol and Videos

https://doi.org/10.59350/9yme1-bcx95

Published October 10, 2013

Author Stephen Turner

One of the clearest advantages RNA-seq has over array-based technology for studying gene expression is not needing a reference genome or a pre-existing oligo array. De novo transcriptome assembly allows you to study non-model organisms, cancer cells, or environmental metatranscriptomes.

Bioinformatics, Software, Biological Sciences

Utility script for launching bare JAR files

https://doi.org/10.59350/y8r2z-rmk71

Published August 21, 2013

Author Stephen Turner

Torsten Seemann compiled a list of minimum standards for bioinformatics command line tools: things like printing help when no commands are specified, including version info, and avoiding hardcoded paths. These should be obvious to any seasoned software engineer, but many of these standards are not followed in bioinformatics.
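Three of the standards named above (help when run with no arguments, version info, no hardcoded paths) can be sketched with Python's standard `argparse` module. This is only an illustration, not the utility script from the post; the tool name `mytool`, the version string, and the `input` argument are hypothetical.

```python
import argparse
import sys

VERSION = "0.1.0"  # hypothetical version string


def build_parser():
    parser = argparse.ArgumentParser(
        prog="mytool",  # hypothetical tool name
        description="Example command line tool skeleton.",
    )
    # --version prints the version and exits.
    parser.add_argument("--version", action="version",
                        version=f"%(prog)s {VERSION}")
    # The input path is supplied by the user, never hardcoded.
    parser.add_argument("input", help="path to the input file")
    return parser


def main(argv=None):
    argv = sys.argv[1:] if argv is None else argv
    parser = build_parser()
    if not argv:
        # Print help instead of failing silently when nothing is specified.
        parser.print_help()
        return 1
    args = parser.parse_args(argv)
    print(f"Processing {args.input}")
    return 0
```

Calling `main([])` prints the help text and returns a non-zero status, while `main(["reads.fq"])` runs on the supplied path.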

Bioinformatics, Databases, SQL, Tutorials, Biological Sciences

Understanding the ENSEMBL Schema

https://doi.org/10.59350/pecdt-k4c24

Published August 12, 2013

Author Stephen Turner

ENSEMBL is a frequently used resource for various genomics and transcriptomics tasks. The ENSEMBL website and MART tools provide easy access to their rich database, but ENSEMBL also provides flat-file downloads of their entire database and a public MySQL portal. You can access this with MySQL Workbench using the following: Once inside, you can get a sense for what the ENSEMBL schema (or data model) is like.

R, Tutorials, Biological Sciences

Google Developers R Programming Video Lectures

https://doi.org/10.59350/7w827-zc641

Published August 5, 2013

Author Stephen Turner

Google Developers recognized that most developers learn R in bits and pieces, which can leave significant knowledge gaps. To help fill these gaps, they created a series of introductory R programming videos. These videos provide a solid foundation for programming tools, data manipulation, and functions in the R language and software.

Bioinformatics, Conferences, R, Software, Visualization, Biological Sciences

Archival, Analysis, and Visualization of #ISMBECCB 2013 Tweets

https://doi.org/10.59350/g3wgz-sa186

Published July 24, 2013

Author Stephen Turner

As the 2013 ISMB/ECCB meeting is winding down, I archived and analyzed the 2000+ tweets from the meeting using a set of bash and R scripts I previously blogged about. The archive of all the tweets tagged #ISMBECCB from July 19-24, 2013 is and will forever remain here on Github. You'll find some R code to parse through this text and run the analyses below in the same repository, explained in more detail in my previous blog post.

Getting Genetics Done
