The New York Times had an interesting piece yesterday about how SAS is facing several business threats from companies like the recently IBM-acquired SPSS, and from burgeoning interest in open-source software like R.
The 2009 Cancer Epidemiology, Biostatistics, and Bioinformatics Retreat will be held on Friday, December 4th, 2009, from 1:30 pm to 5:00 pm, on the eighth floor of the VICC building (898B PRB). The purpose of the retreat is to promote interactions among biostatisticians, bioinformaticians, epidemiologists, clinical investigators, and other translational researchers.
Theresa Scott, instructor of the previously mentioned R workshop and weekly R clinic, is giving a lecture entitled "Reproducible Research with R, LaTeX, & Sweave" in MRB III, room 1220, this Wednesday, 11/18, at 1:30. You can see more details about the lecture here. It looks like her slides, as well as much more introductory material on R, LaTeX, and Sweave, are on her website.
Way back, Will wrote on this topic. See his previous post for Stata code for doing this. Unfortunately, the R package that was used to create QQ-plots here has been removed from CRAN, so I wrote my own using ggplot2 and some code I received from Daniel Shriner at NHGRI. Of course you can use R's built-in qqplot() function, but I could never figure out a way to add the diagonal using base graphics.
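For anyone curious what goes into such a plot, here is a minimal sketch of the underlying computation, written in plain Python rather than R so it stands apart from the ggplot2 code described above (which is not reproduced here). It assumes the usual GWAS-style QQ-plot of -log10 p-values, and it uses rank / (n + 1) for the expected quantiles, which is only one of several common conventions. The function name `qq_points` and the p-values are made up for illustration.

```python
import math


def qq_points(pvalues):
    """Return (expected, observed) pairs of -log10 p-values for a QQ-plot.

    Assumes every p-value is strictly between 0 and 1.
    Expected quantiles use rank / (n + 1), one common convention.
    """
    observed = sorted(pvalues)  # smallest p-value first
    n = len(observed)
    points = []
    for rank, p in enumerate(observed, start=1):
        expected = rank / (n + 1)  # uniform quantile under the null
        points.append((-math.log10(expected), -math.log10(p)))
    return points


# Toy p-values, purely illustrative
pvals = [0.5, 0.01, 0.2, 0.9, 0.05]
for expected, observed in qq_points(pvals):
    print(f"expected={expected:.3f}  observed={observed:.3f}")
```

Under the null hypothesis the points should fall near the line y = x, so the diagonal is simply the line through the origin with slope 1 in whatever plotting tool you use.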
While playing around with the previously mentioned ggplot2, I came across an incredibly useful set of functions in the plyr package, made by Hadley Wickham, the same guy behind ggplot2. If you've ever used MySQL before, think of "GROUP BY", but here you can apply any R function to splits of the data, or write one yourself.
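The "GROUP BY with any function" idea is often called split-apply-combine. Here is a rough sketch of the pattern in plain Python, just to show the concept; it is not plyr's API, and the helper name `split_apply_combine`, the column names, and the toy data are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def split_apply_combine(rows, key, value, func):
    """Group rows by `key`, then apply `func` to each group's `value` list."""
    groups = defaultdict(list)
    for row in rows:  # split: bucket the values by group
        groups[row[key]].append(row[value])
    # apply + combine: one result per group, collected in a dict
    return {group: func(values) for group, values in groups.items()}


# Toy data, purely illustrative
rows = [
    {"genotype": "AA", "trait": 1.0},
    {"genotype": "AB", "trait": 2.0},
    {"genotype": "AA", "trait": 3.0},
    {"genotype": "BB", "trait": 5.0},
    {"genotype": "AB", "trait": 4.0},
]

# A built-in summary, like SQL's AVG() ... GROUP BY genotype
print(split_apply_combine(rows, "genotype", "trait", mean))

# Any function you write yourself works too, here the range within each group
print(split_apply_combine(rows, "genotype", "trait", lambda v: max(v) - min(v)))
```

The second call is the part SQL makes awkward: the function applied to each split can be anything that accepts the group's values.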
There are no common disorders - only extremes of quantitative traits. --- That's the argument made by Plomin, Haworth, and Davis in a great review paper just published online in Nature Reviews Genetics.
Often when presenting statistics from a candidate gene study, or a region of interest from a genome-wide association study, it is useful to see various SNP-wise values in the context of linkage disequilibrium patterns.
Aloha! If you're going to the American Society of Human Genetics meeting next week, come by our poster (poster #116, abstract #1567/T) and say hello! We would love to meet any of you who read GGD and hear what you think and what you'd like to see in the future. Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution (CC BY) License.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani, and Jerome Friedman, one of the best books on data mining and machine learning, is now available free in PDF format. Download it here or view it online here.
Strict quality control procedures are extremely important for any genome-wide association study. One of the first steps you should take when running QC on your GWAS is to look for related samples in your dataset. This does two things for you.
R is a great tool with lots of resources for genetics, genome-wide association studies, and many other biological applications. We've covered several places to find help with R in the past, but if you're still apprehensive about diving into R's command-line interface, fear not. R Commander is a graphical user interface (GUI) for R that works under Windows, Linux, and Mac.