Biological Sciences, English, Blogger

Getting Genetics Done

Getting Things Done in Genetics & Bioinformatics Research
Announcements, Bioinformatics, R, Biological Sciences, English
Published
Author: Stephen Turner

If you're doing any kind of big data analysis - genomics, transcriptomics, proteomics, bioinformatics - then unless you've been on vacation the last few weeks you've no doubt heard about the NSF/NIH BIGDATA Initiative (here's the NSF solicitation and here's the New York Times article about the funding opportunity). The solicitation "aims to advance core scientific and technological means of managing, analyzing, visualizing, and

Bioinformatics, Biological Sciences, English
Published
Author: Stephen Turner

I was reading through a paper on comparative ChIP-Seq when I found this awk gem that lets you get some very basic stats very quickly on next-generation sequencing reads. To use it, simply cat the fastq file (or gunzip -c it) and pipe the output to this awk command: cat myfile.fq | awk '((NR-2)%4==0){read=$1;total++;count[read]++}END{for(read in count){if(!max||count[read]>max) {max=count[read];maxRead=read};if(count[read]==1){unique++}};print
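The awk command is cut off in this excerpt, so what it prints is not shown here. As a rough sketch of the same counting idea in Python (not the original command), the function below tallies the sequence line of each four-line FASTQ record and reports the total number of reads, the number of distinct sequences, the number of sequences seen exactly once, and the most abundant sequence. The names `open_fastq` and `read_stats` and the keys of the returned dictionary are made up for illustration, and the sketch assumes well-formed four-line records.

```python
import gzip
from collections import Counter


def open_fastq(path):
    """Open a FASTQ file as text, transparently handling .gz files."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def read_stats(lines):
    """Summarize reads from an iterable of FASTQ lines.

    Assumes four lines per record, with the sequence on the second line,
    which matches the (NR-2)%4==0 test in the awk one-liner.
    """
    counts = Counter()
    for i, line in enumerate(lines):
        if i % 4 == 1:  # zero-based index 1, 5, 9, ... is the sequence line
            counts[line.strip()] += 1

    top = counts.most_common(1)
    top_read, top_count = top[0] if top else (None, 0)
    return {
        "total": sum(counts.values()),    # all reads
        "distinct": len(counts),          # different sequences observed
        "singletons": sum(1 for c in counts.values() if c == 1),  # seen once
        "top_read": top_read,             # most abundant sequence
        "top_count": top_count,           # how many times it occurs
    }


# Usage (hypothetical file name):
# with open_fastq("myfile.fq.gz") as handle:
#     print(read_stats(handle))
```

Keeping every distinct sequence in memory is the same trade-off the awk version makes with its `count` array, so this is best suited to quick checks and not to very large files.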

Bioinformatics, R, Biological Sciences, English
Published
Author: Stephen Turner

I get asked frequently how to convert from one gene identifier to another. This can be tricky, especially when relying on gene symbols, as Will pointed out in a previous post a few years ago. There are several tools that can do this, including DAVID and the previously mentioned new Biomart ID Converter, but I still prefer using the Ensembl Biomart for this because of its added flexibility and annotation.

Announcements, Biological Sciences, English
Published
Author: Stephen Turner

GGD has a new look. I was inspired by Gina Trapani (Smarterware, Lifehacker) to remove any extra lines, links, and other "ink" that doesn't serve any purpose, and I hope the site appears cleaner and easier to read. I also wanted the extra horizontal space for larger images and to avoid the dreaded side-scrolling in posts with lots of code like this one.

Bioinformatics, Linux, R, Biological Sciences, English
Published
Author: Stephen Turner

*Edit March 12* Be sure to look at the comments, especially the commentary on Hacker News - you can supercharge the find|xargs idea by using find|parallel instead.

Do you ever discover a trick to do something better, faster, or easier, and wish you could reclaim all the wasted time and effort before your discovery?

Bioinformatics, Pathways, R, Visualization, Biological Sciences, English
Published
Author: Stephen Turner

I get a lot of requests in the core about running a "pathway analysis." Someone ran a handful of gene expression arrays, or better yet, ran an RNA-seq experiment (with replicates!). These, and many other kinds of high-throughput assays (GWAS, ChIP-seq, etc.) result in a list of genes and some associated p-value, fold change, or other statistic. Here's some R code to download public data from a study on susceptibility to colorectal cancer.

Bioinformatics, R, Biological Sciences, English
Published
Author: Stephen Turner

I direct the Bioinformatics Core at the University of Virginia, and I'm hiring. Visit this link on the UVA Jobs website for more information. Here's the description: I'm Hiring - Bioinformatics Analyst in the UVA Bioinformatics Core

Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution (CC BY) License.

PubMed, Biological Sciences, English
Published
Author: Stephen Turner

I'm updating my CV and biosketch for a few grant applications, and for some time now, NIH has required you to include the PubMed Central ID for each article you publish that arose from NIH support. I only have a dozen or so papers indexed in PubMed, but I still wanted a way to do this automatically. If you have scores of publications, looking up all the PMCIDs could easily become a hassle. First, create an account at My NCBI.

Ggplot2, R, Visualization, Biological Sciences, English
Published
Author: Stephen Turner

Title: A Backstage Tour of ggplot2 with Hadley Wickham
Date: Wednesday, February 8, 2012
Time: 11:00AM - 12:00PM Pacific
Presenter: Hadley Wickham, Professor of Statistics, Rice University
Register here.