NHST in the 21st Century


Objections to null hypothesis significance testing (NHST) mounted throughout the last half of the 20th century, with Hunter's (1997) "Needed: A Ban on the Significance Test" being perhaps the most dramatic. Articles, conferences, and communications proliferated. In March 2019, the American Statistical Association published 43 papers (all Open Access) in a special issue of The American Statistician. The table of contents is at https://tandfonline.com/toc/utas20/73/sup1?nav=tocList.

The professional development course that follows provides five references to papers and videos. Chosen to give overviews, advice, and materials for teachers of statistics, all are freely available. Additional recommendations follow. I'll end this introduction with my summary conclusion, a haiku.

Our training pointed
To .05 as success.
Wrong. No longer true.

The Course

  1. A good place to start is one of the six videos by Geoff Cumming on the APS website. Cumming uses simulation experiments to repeatedly draw two samples from known populations. The p values vary so much from one replication to the next that they invite distrust.

    Video, Part 1. (27 minutes)

  2. "Moving to a World Beyond 'p < 0.05'" is the lead editorial in The American Statistician's special issue. It addresses reasons not to trust p values, the difficulty of moving beyond NHST, new approaches to data analysis, and recommendations on how to proceed. (20 pages)

    Ronald L. Wasserstein, Allen L. Schirm, & Nicole A. Lazar

  3. One popular alternative to NHST is called the "new statistics." In this video, Geoff Cumming uses simulation to promote effect size indexes and 95 percent confidence intervals in place of p values.

    Video, Part 3 (35 minutes)

  4. This 8-page history describes how Fisher's p < 0.05 and Neyman and Pearson's Type I and Type II errors were amalgamated into NHST. Valuable history for statistics teachers.

    Lee Kennedy-Shaffer    
    "Before p < 0.05 to Beyond p < 0.05: Using History to Contextualize p-Values and Significance Testing"

  5. This paper outlines an undergraduate course that goes beyond data analysis to address the place of statistical thinking within research. Best practices for statistical communication are identified.

    E. Ashley Steel, Martin Liermann, & Peter Guttorp
    "Beyond Calculations: A Course in Statistical Thinking" (10 pages)

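The simulation idea behind items 1 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not a reproduction of Cumming's software: the function name `simulate_replications`, the population means (50 and 60), the standard deviation (20), and the sample size (32 per group) are all assumptions chosen for the example. Because the populations are known, the sketch uses a z test with the known standard deviation rather than a t test.

```python
import random
from statistics import NormalDist, mean


def simulate_replications(n_reps=25, n=32, mu_a=50.0, mu_b=60.0,
                          sigma=20.0, seed=1):
    """Repeat the same two-group experiment n_reps times.

    Returns a list of (p_value, ci_low, ci_high) tuples, one per replication.
    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()              # standard normal distribution
    z_crit = std_normal.inv_cdf(0.975)     # about 1.96 for a 95% interval
    # Standard error of the difference between two means, sigma known.
    se = sigma * (2.0 / n) ** 0.5
    results = []
    for _ in range(n_reps):
        a = [rng.gauss(mu_a, sigma) for _ in range(n)]
        b = [rng.gauss(mu_b, sigma) for _ in range(n)]
        diff = mean(b) - mean(a)           # observed effect (raw units)
        z = diff / se
        p = 2.0 * (1.0 - std_normal.cdf(abs(z)))   # two-sided p value
        results.append((p, diff - z_crit * se, diff + z_crit * se))
    return results


if __name__ == "__main__":
    for p, low, high in simulate_replications():
        print(f"p = {p:.3f}   95% CI for the difference: [{low:6.2f}, {high:6.2f}]")
```

Running the script prints one line per replication. The populations never change, yet the p values move around from run to run, while each interval reports both the size of the effect and its uncertainty.
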
Additional References  

  1. Two references that place statistics in the larger endeavor of science:

    1. Distinguishes between statistical inference (finding differences) and scientific inference (a search for widespread agreement).

      Raymond Hubbard, Brian D. Haig, & Rahul A. Parsa
      "The Limited Role of Formal Statistical Inference in Scientific Inference" (8 pages)

    2. Lists concerns about NHST and the larger issue of how we should conduct science.

      Shrout, P. E. & Rodgers, J. L.
      Annual Review of Psychology. (2018, pp. 487-510)  
      "Psychology, science, and knowledge construction: Broadening perspectives from the replication crisis"

  2. Summary of methodology improvements in psychology. Identifies p-hacking as more important than the file-drawer problem.

    Nelson, L. D., Simmons, J., & Simonsohn, U.
    Annual Review of Psychology. (2018, pp. 511-535)
    "Psychology's renaissance"

  3. Argues for improving scientific practice by reducing overconfidence and confirmation bias and by fostering cautious judgments of consistency. Lists unpersuasive arguments for p values.

    Robert J. Calin-Jageman & Geoff Cumming
    "The new statistics for better science: Ask how much, how uncertain, and what else is known" (11 pages)

  4. You will likely find other articles of interest among the remaining papers in the March 2019 special issue of The American Statistician.

