The New York Times published this interesting article on how the ability to design and perform computer simulations is a highly marketable skill for careers across many disciplines. In methodology development we use simulation nearly every day.
I've linked to UCLA's stat computing resources once before, in a post about choosing the right analysis for the questions you're asking and the data types you have.
The estout package for Stata is useful for quickly creating nicely formatted tables from a regression analysis for use in papers. To install it, fire up Stata and type in this command:

ssc install estout, replace

Stata will automatically download and install the package. Run the regression as you normally would, then use the esttab command (part of the estout package) to create a table using those results.
I know that a lot of you are scrambling to spend your training grant money by next week. If you think you'll ever need to use R, I strongly recommend buying this book: Introductory Statistics with R, by Peter Dalgaard ($48, Amazon). I picked this up a while back and read through most of it in a day or two.
Google's chief economist was recently quoted as saying "The sexy job in the next ten years will be statisticians… The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it - that's going to be a hugely important skill." I'll leave you for the weekend with this ego-boosting article relating how our skill set as statisticians is a hot commodity in the real world. Source: Dataspora Blog.
Has anyone ever used Galaxy? I saw their presentation at last year's ASHG. It seems like a great way to collaborate on analyses and keep a record of them in an easy web interface without having to download any software.
Nature Reviews Genetics just published an excellent paper on interaction analysis by Heather Cordell. This masterfully written review starts by defining interaction, then delves into strategies to statistically model it in human genetics studies.
If you're doing an analysis with variables that naturally vary on a continuous scale, like age or smoking pack-years, NEVER be tempted to categorize individuals into groups - there's nearly always a better approach that utilizes the full distribution of values. Categorizing may seem convenient for a particular analysis you're doing, but you'll take an enormous hit in power and precision.
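To get a feel for the size of that power hit, here is a minimal simulation sketch in Python (standard library only). Everything in it is an assumption chosen for illustration: a standard-normal predictor, a linear effect of 0.25, 100 individuals per dataset, a median split as the categorization, and a normal approximation to the t-test of the correlation. The function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, median


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rejects_null(x, y, z_crit):
    """Test H0: no association, using t = r * sqrt((n - 2) / (1 - r^2)).

    The statistic is compared to a normal critical value, which is an
    approximation to the t distribution that is reasonable for n around 100.
    """
    n = len(x)
    r = pearson_r(x, y)
    t = r * sqrt((n - 2) / (1 - r * r))
    return abs(t) > z_crit


def simulate_power(n=100, beta=0.25, n_sims=1000, alpha=0.05, seed=1):
    """Return (power with continuous x, power with median-split x)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits_continuous = 0
    hits_split = 0
    for _ in range(n_sims):
        # Assumed data-generating model: y depends linearly on x plus noise.
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [beta * xi + rng.gauss(0, 1) for xi in x]
        # Categorize the same predictor into "low" and "high" at the median.
        cut = median(x)
        x_split = [1.0 if xi > cut else 0.0 for xi in x]
        hits_continuous += rejects_null(x, y, z_crit)
        hits_split += rejects_null(x_split, y, z_crit)
    return hits_continuous / n_sims, hits_split / n_sims


power_continuous, power_split = simulate_power()
print(f"power, continuous predictor:   {power_continuous:.2f}")
print(f"power, median-split predictor: {power_split:.2f}")
```

Both tests are run on the same simulated datasets, so the gap between the two numbers reflects only the information thrown away by the split. Changing beta, n, or the cut point shows how the loss varies under other assumptions.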
Last week I posted a short tutorial on how to merge datasets using R. R is free, open-source statistical computing software and a programming language (get R here). The only downside is a steeper learning curve, because the documentation is sparse and often difficult to understand at first.
What's your power to detect a recessive effect with an odds ratio of 1.2 for a disease with 4.2% prevalence using 1200 cases and 2900 controls? What if the allele is rare? Is it worth it, in terms of power gain, to genotype 1000 more individuals? How small an effect can you detect with 80% power using the data you have? These questions and others can be answered by power and sample size calculations.
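As a rough illustration of the first question, here is a sketch of a normal-approximation power calculation in Python (standard library only). It compares the frequency of the homozygous risk genotype between cases and controls. Several inputs are assumptions that are not given above: a risk-allele frequency of 0.3, Hardy-Weinberg proportions in the population, a two-sided alpha of 0.05, and an even split of the 1000 extra individuals between cases and controls. The function names are hypothetical, and dedicated power software may use other approximations and give somewhat different answers.

```python
from math import sqrt
from statistics import NormalDist


def recessive_exposure_freqs(allele_freq, odds_ratio, prevalence):
    """Frequency of the homozygous risk genotype in cases and in controls.

    Assumes Hardy-Weinberg proportions and a fully recessive model: only
    homozygotes have their odds of disease multiplied by odds_ratio.
    """
    q = allele_freq ** 2  # population frequency of the risk homozygote

    def risk_in_homozygotes(f0):
        odds = odds_ratio * f0 / (1 - f0)
        return odds / (1 + odds)

    # Bisection for the baseline risk f0 that reproduces the prevalence.
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        k = q * risk_in_homozygotes(mid) + (1 - q) * mid
        if k < prevalence:
            lo = mid
        else:
            hi = mid
    f0 = (lo + hi) / 2
    f1 = risk_in_homozygotes(f0)
    k = q * f1 + (1 - q) * f0
    p_case = q * f1 / k
    p_control = q * (1 - f1) / (1 - k)
    return p_case, p_control


def power_two_proportions(p1, n1, p2, n2, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Ignores the negligible chance of rejecting in the wrong direction.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return z.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


# Odds ratio, prevalence, and sample sizes come from the questions above;
# the allele frequency of 0.3 is an assumed value for illustration.
p_case, p_control = recessive_exposure_freqs(0.3, 1.2, 0.042)
current = power_two_proportions(p_case, 1200, p_control, 2900)
# Assumed split of the 1000 additional individuals: 500 cases, 500 controls.
larger = power_two_proportions(p_case, 1700, p_control, 3400)
print(f"power with 1200 cases / 2900 controls: {current:.2f}")
print(f"power with 1700 cases / 3400 controls: {larger:.2f}")
```

Rerunning with a smaller allele frequency addresses the rare-allele question, and stepping the odds ratio upward until the power reaches 0.80 gives a rough answer to the smallest-detectable-effect question.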