BiologíaInglésBlogger

Getting Genetics Done

Getting Things Done in Genetics & Bioinformatics Research
Página de inicio
language
AnnouncementsBioinformaticsRBiologíaInglés
Publicado
Autor Stephen Turner

If you're doing any kind of big data analysis - genomics, transcriptomics, proteomics, bioinformatics - then unless you've been on vacation the last few weeks you've no doubt heard about the NSF/NIH BIGDATA  Initiative (here's the NSF solicitation and here's the New York Times article about the funding opportunity). The solicitation "aims to advance core scientific and technological means of managing, analyzing, visualizing, and

BioinformaticsBiologíaInglés
Publicado
Autor Stephen Turner

I was reading through a paper on comparative ChIP-Seq when I found this awk gem that lets you get some very basic stats very quickly on next generation sequencing reads. To use, simply cat the fastq file (or gunzip -c) and pipe that to this awk command: cat myfile.fq | awk '((NR-2)%4==0){read=$1;total++;count[read]++}END{for(read in count){if(!max||count[read]>max) {max=count[read];maxRead=read};if(count[read]==1){unique++}};print

BioinformaticsRBiologíaInglés
Publicado
Autor Stephen Turner

I get asked frequently how to convert from one gene identifier to another. This can be tricky, especially when relying on gene symbols, as Will pointed out in a previous post a few years ago. There are several tools that can do this, including DAVID and the previously mentioned new Biomart ID Converter, but I still prefer using the Ensembl Biomart for this because of its added flexibility and annotation.

AnnouncementsBiologíaInglés
Publicado
Autor Stephen Turner

GGD has a new look. I was inspired by Gina Trapani (Smarterware, Lifehacker) to remove any extra lines, links, and other "ink" that doesn't serve any purpose, and I hope the site appears cleaner and easier to read. I also wanted the extra horizontal space for larger images and avoid the dreaded side-scrolling in posts with lots of code like this one.

BioinformaticsLinuxRBiologíaInglés
Publicado
Autor Stephen Turner

*Edit March 12* Be sure to look at the comments, especially the commentary on Hacker News - you can supercharge the find|xargs idea by using find|parallel instead. --- Do you ever discover a trick to do something better, faster, or easier, and wish you could reclaim all the wasted time and effort before your discovery?

BioinformaticsPathwaysRVisualizationBiologíaInglés
Publicado
Autor Stephen Turner

I get a lot of requests in the core about running a "pathway analysis." Someone ran a handful of gene expression arrays, or better yet, ran an RNA-seq experiment (with replicates!). These, and many other kinds of high-throughput assays (GWAS, ChIP-seq, etc.) result in a list of genes and some associated p-value, fold change, or other statistic. Here's some R code to download public data from a study on susceptibility to colorectal cancer.

BioinformaticsRBiologíaInglés
Publicado
Autor Stephen Turner

I direct the Bioinformatics Core at the University of Virginia, and I'm hiring. Visit this link on the UVA Jobs website for more information. Here's the description: I'm Hiring - Bioinformatics Analyst in the UVA Bioinformatics CoreGetting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution (CC BY) License.

PubMedBiologíaInglés
Publicado
Autor Stephen Turner

I'm updating my CV and biosketch for a few grant applications, and for some time now, NIH has required you to include the PubMed Central ID for each article you publish that arose from NIH support. I only have a dozen or so papers indexed in PubMed, but I still wanted a way to do this automatically. If you have scores of publications, looking up all the PMCIDs could easily become a hassle. First, create an account at My NCBI.