Biological SciencesBlogger

Getting Genetics Done

Getting Things Done in Genetics & Bioinformatics Research
Home Page
language
BioinformaticsGWASPathwaysRecommended ReadingBiological Sciences
Published
Author Stephen Turner

I just read a helpful paper on pathway analysis and interactome reconstruction: Tieri, P., Fuente, A. D., Termanini, A., & Franceschi, C. (2011). Integrating Omics Data for Signaling Pathways, Interactome Reconstruction, and Functional Analysis. In Bioinformatics for Omics Data, Methods in Molecular Biology, vol.

PerlPubMedBiological Sciences
Published

NCBI has put a lot of effort into unifying their data access and retrieval system -- whether you are searching for a gene, protein, or publication, the results are returned in a similar fashion.

BioinformaticsGWASRSoftwareBiological Sciences
Published
Author Stephen Turner

In a previous post I linked to gcol as a quick and intuitive alternative to awk. I just stumbled across yet another set of handy text file manipulation utilities from the creators of the BEAGLE software for GWAS data imputation and analysis.

BioinformaticsRecommended ReadingBiological Sciences
Published
Author Stephen Turner

After evaluating an unnamed bioinformatics core facility, a group of bioinformaticians in Europe wrote up a short list of basic guidelines for organizing a bioinformatics core facility in large research institutes.

SoftwareBiological Sciences
Published
Author Stephen Turner

A while back Will showed you how to ditch Excel for awk, a handy Unix command line tool for extracting certain rows and columns from a text file. While I was browsing the documentation on the previously mentioned PLINK/SEQ library, I came across gcol, another utility for extracting columns from a tab-delimited text file. It can't do anything that awk can't, but it's easier and more intuitive to use for simple text munging tasks.

RSQLBiological Sciences
Published
Author Stephen Turner

Jeffrey Breen put together a useful slideshow on accessing databases from R. I use RODBC every single day to access my own local MySQL server from R. I've had trouble with RMySQL, so I've always used RODBC instead after setting up my localhost MySQL server as a Windows data source. Once you get accustomed to accessing your data directly with SQL queries rather than dumping files you'll wonder why you waited so long.

1000 GenomesBioinformaticsGWASRSequencingBiological Sciences
Published
Author Stephen Turner

PLINK/SEQ is an open source C/C++ library for analyzing large-scale genome sequencing data. The library can be accessed via the pseq command line tool, or through an R interface. The project is developed independently of PLINK but it's syntax will be familiar to PLINK users. PLINK/SEQ boasts an impressive feature set for a project still in the beta testing phase.

BioinformaticsNoteworthy BlogsSequencingSoftwareBiological Sciences
Published
Author Stephen Turner

This is a few months old but I just got around to reading this series of blog posts on next-generation sequencing (NGS) by Gabe Rudy, Golden Helix's VP of product development. This series gives a seriously useful overview of NGS technology, then delves into the analysis of NGS data at each step, right down to a description of the most commonly used file formats and tools for the job. Check it out now if you haven't already.

1000 GenomesLinuxPLINKSequencingSoftwareBiological Sciences
Published
Author Stephen Turner

I recently analyzed some next-generation sequencing data, and I first wanted to compare the frequencies in my samples to those in the 1000 Genomes Project. It turns out this is much easier that I thought, as long as you're a little comfortable with the Linux command line. First, you'll need a Linux system, and two utilities: tabix and vcftools. I'm virtualizing an Ubuntu Linux system in Virtualbox on my Windows 7 machine.

RWritingBiological Sciences
Published
Author Stephen Turner

I love the idea of using R+LaTeX+Sweave for reproducible research. This is even easier now that R has a jazzy new IDE that supports Sweave syntax highlighting and automatic PDF generation. I know I'm going to take some flak for saying this, but let's be honest here... If you're working in the biomedical sciences, chances are, your collaborators have never heard of Sweave. Physicians only use LaTeX during surgery.