Rogue Scholar

Linux, Perl, Tutorials, Biological Sciences

How To Install BioPerl Without Root Privileges

Published January 13, 2014

Author Stephen Turner

I've seen this question asked and partially answered all around the web. As with anything related to Perl, I'm sure there is more than one way to do it. Here's how I do it with Perl 5.10.1 on CentOS 6.4. First, install local::lib with the bootstrapping method as described here.

Bioinformatics, Metagenomics, Python, R, Recommended Reading, Biological Sciences

Jeff Leek's non-comprehensive list of awesome things other people did in 2013

https://doi.org/10.59350/j87eq-84k69

Published December 31, 2013

Author Stephen Turner

Jeff Leek, biostats professor at Johns Hopkins and instructor of the Coursera Data Analysis course, recently posted on Simply Statistics this list of awesome things other people accomplished in 2013 in genomics, statistics, and data science. At risk of sounding too meta, I'll say that this list itself is one of the awesome things that was put together in 2013.

Bioinformatics, Software, Biological Sciences

Curoverse raises $1.5M to develop & support an open-source bioinformatics data analysis platform

https://doi.org/10.59350/e8f1k-2ev20

Published December 18, 2013

Author Stephen Turner

Boston-based startup Curoverse has announced $1.5 million in funding to develop and support the open-source Arvados platform for cloud-based bioinformatics & genomics data analysis. The Arvados platform was developed in George Church's lab by scientists and engineers led by Alexander Wait Zaranek, now scientific director at Curoverse.

Bioinformatics, Databases, Tutorials, Biological Sciences

Biostar Tutorial: Cheat sheet for one-based vs zero-based coordinate systems

https://doi.org/10.59350/9y10w-1y991

Published December 9, 2013

Author Stephen Turner

Obi Griffith over at Biostar put together this excellent cheat sheet for dealing with one-based and zero-based genomic coordinate systems. The cheat sheet visually explains the difference between zero and one-based coordinate systems, as well as how to indicate a position, SNP, range, or indel using both coordinate systems.
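As a rough illustration of the range case, here is a minimal Python sketch. It assumes the common conventions of 1-based, fully-closed intervals (as in GFF and VCF) and 0-based, half-open intervals (as in BED); the function names are hypothetical and are not taken from the cheat sheet.

```python
def one_to_zero_based(start, end):
    """Convert a 1-based, fully-closed interval to 0-based, half-open."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # Only the start shifts; the closed end equals the half-open end.
    return start - 1, end


def zero_to_one_based(start, end):
    """Convert a 0-based, half-open interval to 1-based, fully-closed."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    return start + 1, end


# Toy sequence for illustration only.
seq = "ACGTACGT"

# Bases 3 through 5 in 1-based coordinates...
s0, e0 = one_to_zero_based(3, 5)

# ...map directly onto a Python slice, which is also 0-based, half-open.
sub = seq[s0:e0]
```

A single position such as a SNP is just the special case where start equals end in 1-based coordinates, so it becomes an interval of length one in the 0-based system.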

Annotation, Bioinformatics, GWAS, PLINK, SQL, Biological Sciences

Using Database Joins to Compare Results Sets

https://doi.org/10.59350/xwgvw-4xg85

Published November 20, 2013

One of the most powerful tools you can learn to use in genomics research is a relational database system, such as MySQL. These systems are fairly easy to set up and use, and provide users the ability to organize and manipulate data and statistical results with simple commands. As a graduate student (during the height of GWAS), this single skill quickly turned me into an “expert”.
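To give a flavor of the idea, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The table names, SNP identifiers, and p-values are made up for illustration and do not come from the post.

```python
import sqlite3

# In-memory database; sqlite3 is used here instead of MySQL so the
# example runs without a server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two hypothetical result sets keyed by SNP identifier.
cur.execute("CREATE TABLE study_a (snp TEXT PRIMARY KEY, p REAL)")
cur.execute("CREATE TABLE study_b (snp TEXT PRIMARY KEY, p REAL)")
cur.executemany(
    "INSERT INTO study_a VALUES (?, ?)",
    [("rs1", 1e-8), ("rs2", 0.03), ("rs3", 0.5)],
)
cur.executemany(
    "INSERT INTO study_b VALUES (?, ?)",
    [("rs1", 2e-6), ("rs3", 0.4), ("rs4", 1e-9)],
)

# INNER JOIN: SNPs present in both result sets, with both p-values.
shared = cur.execute(
    """
    SELECT a.snp, a.p, b.p
    FROM study_a AS a
    INNER JOIN study_b AS b ON a.snp = b.snp
    ORDER BY a.snp
    """
).fetchall()

# LEFT JOIN plus a NULL check: SNPs found only in study_a.
only_a = cur.execute(
    """
    SELECT a.snp
    FROM study_a AS a
    LEFT JOIN study_b AS b ON a.snp = b.snp
    WHERE b.snp IS NULL
    ORDER BY a.snp
    """
).fetchall()

conn.close()
```

The same two queries should carry over to MySQL largely unchanged, since both are standard SQL joins.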

GWAS, R, Visualization, Biological Sciences

A Mitochondrial Manhattan Plot

https://doi.org/10.59350/dvd1d-ywx41

Published November 6, 2013

Manhattan plots have become the standard way to visualize results for genetic association studies, allowing the viewer to instantly see significant results in the rough context of their genomic position. Manhattan plots are typically shown on a linear X-axis (although the circos package can be used for radial plots), and this is consistent with the linear representation of the genome in online genome browsers.

Bioinformatics, Conferences, R, Visualization, Biological Sciences

Archival and analysis of #GI2013 Tweets

https://doi.org/10.59350/11jz7-py935

Published November 4, 2013

Author Stephen Turner

I archived and analyzed all Tweets containing #GI2013 from the recent Cold Spring Harbor Genome Informatics meeting, using my previously described code. Friday was the most Tweeted day.

Bioinformatics, R, RNA-Seq, Biological Sciences

Real-time streaming differential RNA-seq analysis with eXpress

https://doi.org/10.59350/ma6p5-qvh42

Published October 31, 2013

Author Stephen Turner

RNA-seq has been performed routinely for more than five years, yet there is no consensus on the best methodology for analyzing this data.

Conferences, Git, R, Twitter, Visualization, Biological Sciences

Analysis of #ASHG2013 Tweets

https://doi.org/10.59350/gmvz8-e7155

Published October 28, 2013

Author Stephen Turner

I archived and analyzed all Tweets with the hashtag #ASHG2013 using my previously mentioned code. Number of Tweets by date shows Wednesday was the most Tweeted day: The top used hashtags other than #ASHG2013: The most prolific users: And what Twitter analysis would be complete without the widely loved, and more widely hated, word cloud: Edit 8:24am: I have gotten notes that some Tweets were not captured in this archive.

PubMed, Biological Sciences

PubMed Commons: One post-publication peer review forum to rule them all?

https://doi.org/10.59350/yr9rv-5rb57

Published October 22, 2013

Author Stephen Turner

Several post-publication peer review forums already exist, such as Faculty of 1000 or PubPeer, that facilitate discussion of papers after they have already been published. F1000 only allows a small number of "faculty" to comment on articles, and access to read commentary requires a paid subscription. PubPeer and similar startup services lack a critical mass of participants to make such a community truly useful.

Git, Github, Linux, Biological Sciences

Useful Unix/Linux One-Liners for Bioinformatics

https://doi.org/10.59350/6vkcn-04941

Published October 21, 2013

Author Stephen Turner

Much of the work that bioinformaticians do is munging and wrangling massive amounts of text. While there are some "standardized" file formats (FASTQ, SAM, VCF, etc.) and some tools for manipulating them (fastx toolkit, samtools, vcftools, etc.), there are still times when knowing a little bit of Unix/Linux (namely awk, sed, cut, grep, GNU parallel, and others) is extremely helpful.

Getting Genetics Done