russpoldrack.org: February 2016

We had a great Town Hall Meeting of our department earlier this week, focused on issues around reproducibility, which Mike Frank has already discussed in his blog. A number of the questions raised by both faculty and graduate students centered on training, and this has gotten many of us thinking about how we should update our quantitative training to address these concerns. Currently the graduate statistics course is fairly standard, covering basic topics in probability and statistics, including basic probability theory, sampling distributions, null hypothesis testing, general(ized) linear models (regression, ANOVA), and mixed models, with exercises done primarily using R. While many of these topics remain essential for psychologists and neuroscientists, it's equally clear that there are a number of other topics we might want to cover that are highly relevant to issues of reproducibility:

the statistics of reproducibility (e.g., implications of power for predictive validity; Ioannidis, 2005)
Bayesian estimation and inference
bias/variance tradeoffs and regularization
generalization and cross-validation
model-fitting and model comparison
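
To make the first item concrete, here is a minimal sketch of how statistical power feeds into the positive predictive value (PPV) of a significant finding, using the bias-free form of the relationship in Ioannidis (2005): PPV = (1 - β)R / ((1 - β)R + α), where R is the pre-study odds that a tested effect is real. The function name and the default values (α = 0.05, R = 0.25) are illustrative assumptions, not figures from any particular field.

```python
def positive_predictive_value(power, alpha=0.05, prior_odds=0.25):
    """Probability that a significant result reflects a true effect.

    Bias-free form from Ioannidis (2005):
        PPV = power * R / (power * R + alpha)
    prior_odds (R) is the assumed ratio of true to null effects among
    the hypotheses being tested; 0.25 is only an illustrative choice.
    """
    true_positives = power * prior_odds   # true effects that reach significance
    false_positives = alpha               # null effects that reach significance
    return true_positives / (true_positives + false_positives)


# Compare a low-powered, a moderately powered, and a well-powered study.
for power in (0.2, 0.5, 0.8):
    ppv = positive_predictive_value(power)
    print(f"power={power:.1f}  PPV={ppv:.2f}")
```

Under these assumed values, a study with 20% power yields a PPV of 0.50, while one with 80% power yields 0.80, which is the sense in which low power undermines the credibility of significant results.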

There are also a number of topics that are clearly related to reproducibility but fall more squarely under the topic of "software hygiene":

data management
code validation and testing
version control
reproducible workflows (e.g., virtualization/containerization)
literate programming

I would love to hear your thoughts about what a 21st century graduate statistics course in psychology/neuroscience should cover. Please leave comments below!

Friday, February 26, 2016

Reproducibility and quantitative training in psychology