Rogue Scholar

Bioinformatics, Linux, R, Biology, English

find | xargs ... Like a Boss

Published March 9, 2012

*Edit March 12:* Be sure to look at the comments, especially the commentary on Hacker News. You can supercharge the find|xargs idea by using find|parallel instead.

Do you ever discover a trick to do something better, faster, or easier, and wish you could reclaim all the wasted time and effort before your discovery?
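As a rough illustration of the find | xargs pattern named in the title, here is a minimal sketch. It is not taken from the original post: the scratch directory, the file names, and the choice of `cat`, `wc`, and `gzip` are all assumptions made for the demo. The `-print0`, `-0`, and `-P` options are extensions found in GNU and BSD find/xargs rather than in strict POSIX.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, created only for this demo.
demo_dir=$(mktemp -d)
printf 'a\nb\n' > "$demo_dir/sample one.txt"
printf 'c\n' > "$demo_dir/sample2.txt"
printf 'ignored\n' > "$demo_dir/notes.log"

# find emits NUL-terminated paths (-print0) and xargs reads them (-0),
# so file names containing spaces survive intact.
# cat concatenates the matched files and wc -l counts their lines.
total_lines=$(find "$demo_dir" -type f -name '*.txt' -print0 | xargs -0 cat | wc -l | tr -d ' ')
echo "lines in .txt files: $total_lines"

# -n 1 runs one gzip per file; -P 2 allows up to two to run at once.
find "$demo_dir" -type f -name '*.txt' -print0 | xargs -0 -n 1 -P 2 gzip
gz_count=$(find "$demo_dir" -type f -name '*.gz' | wc -l | tr -d ' ')
echo "compressed files: $gz_count"
```

The `-P` flag is what gives xargs some of the parallelism that the edit note attributes to find|parallel. Remove the scratch directory with `rm -rf "$demo_dir"` when finished.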

Bioinformatics, Pathways, R, Visualization, Biology, English

Pathway Analysis for High-Throughput Genomics Studies

https://doi.org/10.59350/ys4fp-ke489

Published March 6, 2012

Author Stephen Turner

I get a lot of requests in the core about running a "pathway analysis." Someone ran a handful of gene expression arrays, or better yet, ran an RNA-seq experiment (with replicates!). These, and many other kinds of high-throughput assays (GWAS, ChIP-seq, etc.) result in a list of genes and some associated p-value, fold change, or other statistic. Here's some R code to download public data from a study on susceptibility to colorectal cancer.

Bioinformatics, R, Biology, English

I'm Hiring!

https://doi.org/10.59350/srst0-9kk55

Published February 24, 2012

Author Stephen Turner

I direct the Bioinformatics Core at the University of Virginia, and I'm hiring. Visit this link on the UVA Jobs website for more information. Here's the description: I'm Hiring - Bioinformatics Analyst in the UVA Bioinformatics Core

Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution (CC BY) License.

PubMed, Biology, English

Your Publications (with PMCID) as a PubMed Query

https://doi.org/10.59350/v47eh-mw411

Published February 17, 2012

Author Stephen Turner

I'm updating my CV and biosketch for a few grant applications, and for some time now, NIH has required you to include the PubMed Central ID for each article you publish that arose from NIH support. I only have a dozen or so papers indexed in PubMed, but I still wanted a way to do this automatically. If you have scores of publications, looking up all the PMCIDs could easily become a hassle. First, create an account at My NCBI.

Bioinformatics, Biology, English

Webinar: Genomic Networks - Resolving Biomarkers from a Cloud of Data

https://doi.org/10.59350/p016b-42183

Published February 8, 2012

Author Stephen Turner

Kevin White from the University of Chicago will be giving a special guest lecture at NCI next week on systems biology approaches to mine genomics data for biomarkers and therapeutic targets. The lecture will be available online as a videocast.

Ggplot2, R, Visualization, Biology, English

Hadley Wickham: ggplot2 Webinar (Today!)

https://doi.org/10.59350/earkp-qqe36

Published February 8, 2012

Author Stephen Turner

Title: A Backstage Tour of ggplot2 with Hadley Wickham
Date: Wednesday, February 8, 2012
Time: 11:00AM - 12:00PM Pacific
Presenter: Hadley Wickham, Professor of Statistics, Rice University
Register here.

Biology, English

Joint Techs Netcast: Enhancing Infrastructure Support for Data Intensive Science

https://doi.org/10.59350/ratdh-ej621

Published January 20, 2012

Author Stephen Turner

The winter Joint Techs meeting is next week in Baton Rouge. I'm not going, but I plan on participating via a netcast to see what's going on. Jim Bottum, Clemson's CIO, is moderating an entire day devoted to the topic Enhancing Infrastructure Support for Data Intensive Science. Of particular interest to me are the talks from 9:30-11am Tuesday January 24 from researchers and those supporting climatology, genomics, and the XSEDE projects.

Bioinformatics, R, Software, Biology, English

Annotating limma Results with Gene Names for Affy Microarrays

https://doi.org/10.59350/vw55p-gr892

Published January 17, 2012

Author Stephen Turner

Lately I've been using the limma package often for analyzing microarray data. When I read in Affy CEL files using ReadAffy(), the resulting ExpressionSet won't contain any featureData annotation. Consequently, when I run topTable to get a list of differentially expressed genes, there's no annotation information other than the Affymetrix probeset IDs or transcript cluster IDs.

Productivity, R, Tutorials, Biology, English

New Year's Resolution: Learn How to Code

https://doi.org/10.59350/mtxn3-1c431

Published January 5, 2012

Author Stephen Turner

Farhad Manjoo at Slate has a good article on why you need to learn how to program. Chances are, if you're reading this post, you're already fairly adept at some form of programming. But if you're not, you should give it some serious thought.

R, Biology, English

Query a MySQL Database from R using RMySQL

https://doi.org/10.59350/ksgk2-vqc08

Published December 15, 2011

Author Stephen Turner

I use this all the time, and the setup is dead simple. Follow the code below to load the RMySQL package, connect to a database (here the UCSC genome browser's public MySQL instance), set up a function to make querying easier, and query the database to return results as a data frame.

Announcements, Bioinformatics, Writing, Biology, English

Galaxy Project Group on CiteULike and Mendeley

https://doi.org/10.59350/ahgx4-y5v08

Published December 15, 2011

Author Stephen Turner

The Galaxy Project started using CiteULike to organize papers that are about, use, or reference Galaxy. The Galaxy CiteULike group is open to any CUL user, and once you join, you can add papers to the group, assign tags, and rate papers.