dxalxmur.com

Understanding the Issues with P Values: Why Less Than 0.05 Isn't Enough

Introduction to P Values

The concept of the p value is pivotal in frequentist statistics, often introduced in basic statistics courses. Unfortunately, many of these courses fail to adequately explain the capabilities and limitations of p values, sometimes perpetuating misconceptions about their significance in establishing causal relationships. This has led to misinterpretations among students and researchers alike, particularly when they encounter p values below the 0.05 threshold. The aim of this article is to demystify the p value and inspire data scientists to explore alternative metrics in their analyses.

What Exactly is a P Value?

To grasp the p value, one must first understand Null Hypothesis Significance Testing (NHST). NHST involves formulating two hypotheses: the null hypothesis, which posits no effect of a treatment on a variable, and the alternative hypothesis, which claims that the treatment does have an effect.

For instance, consider an investigation into whether minimum wage laws influence unemployment rates. The hypotheses could be defined as follows:

  • Null hypothesis: Minimum wage laws do not affect unemployment rates.
  • Alternative hypothesis: Minimum wage laws increase unemployment rates.

Researchers typically aim to reject the null hypothesis, as doing so supports the alternative hypothesis. The p value is a statistical tool used to determine whether the null hypothesis can be rejected.

In a hypothetical scenario, after studying a sample of American cities, we find that cities with minimum wage laws have an unemployment rate that is slightly higher, on average, than cities without these laws.

Is this evidence against minimum wage policies? Not necessarily. In academic research, a p value above 0.05 is generally deemed not statistically significant. For example, if our analysis yields a p value of 0.78, it means that, if the null hypothesis were true, we would see a difference at least as large as the one observed about 78% of the time. Such a result is entirely compatible with chance and would likely be dismissed by academic journals, which often favor results with lower p values.
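To make the definition concrete, here is a minimal sketch of one way such a p value could be computed: a one-sided permutation test. The unemployment figures are invented for illustration, and the function name `permutation_p_value` is hypothetical; this is not the analysis behind the 0.78 figure above.

```python
import random
from statistics import mean

def permutation_p_value(treated, control, n_permutations=5000, seed=0):
    """Estimate a one-sided p value for mean(treated) - mean(control).

    Under the null hypothesis the group labels are interchangeable, so we
    shuffle them and count how often the shuffled difference is at least
    as large as the one actually observed.
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if diff >= observed:
            at_least_as_extreme += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Made-up unemployment rates (%), NOT real data.
with_law = [5.1, 4.8, 6.0, 5.5, 4.9, 5.7]
without_law = [5.0, 4.9, 5.8, 5.6, 4.7, 5.9]

p = permutation_p_value(with_law, without_law)
print(f"one-sided p value: {p:.2f}")
```

Because the two made-up groups differ only slightly, shuffled labels produce a difference this large very often, so the p value comes out far above 0.05.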

Challenges with P Values

  1. P-Hacking

After receiving feedback that our p value is too high, we may feel inclined to manipulate our analysis to achieve a lower p value. For instance, we might expand our study to include data from other countries, since with a larger sample even a small difference tends to produce a smaller p value. We may also adjust our model to control for various factors, thereby complicating it and potentially reducing the p value further. Ultimately, after trying enough variations, we may arrive at a p value of 0.049, making our findings appear statistically significant. This practice, often referred to as "p hacking," can obscure the true nature of the data and produce unreliable results.
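A small simulation can show why this is dangerous. The sketch below uses a simplified form of p hacking that differs from the examples above: it runs many independent analyses on pure noise and reports only the smallest p value. The z test with known unit variance, the sample sizes, and all function names are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def z_test_p_value(a, b):
    """Two-sided z test for a difference in means, assuming unit variance."""
    standard_error = (1 / len(a) + 1 / len(b)) ** 0.5
    z = (sum(a) / len(a) - sum(b) / len(b)) / standard_error
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))

def smallest_p_value(rng, n_specs, n_per_group=30):
    """Run n_specs analyses on pure-noise data and keep the smallest p value."""
    p_values = []
    for _ in range(n_specs):
        # Both groups come from the same distribution: the null is true.
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        p_values.append(z_test_p_value(a, b))
    return min(p_values)

def false_positive_rate(n_specs, n_studies=1000, seed=1):
    """Share of simulated studies whose best p value falls below 0.05."""
    rng = random.Random(seed)
    hits = sum(smallest_p_value(rng, n_specs) < 0.05 for _ in range(n_studies))
    return hits / n_studies

honest = false_positive_rate(n_specs=1)   # one pre-planned analysis
hacked = false_positive_rate(n_specs=20)  # best of 20 attempts
print(f"honest: {honest:.2f}, hacked: {hacked:.2f}")
```

With a single analysis, about 5% of null studies should cross the threshold by chance. With 20 independent attempts, the chance that at least one does is 1 - 0.95^20, or roughly 64%, even though no real effect exists.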

  2. Misinterpretation of Truth

Upon achieving a p value of 0.049, researchers might erroneously conclude that they have disproven the null hypothesis. However, a low p value alone does not establish the truth or falsity of either hypothesis. The American Statistical Association (ASA) has noted that p values do not measure the probability that a hypothesis is true. Rather, they indicate the probability of observing data at least as extreme as those actually observed, under the assumption that the null hypothesis is true.
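The gap between "probability of the data given the null" and "probability of the null given the data" can be sketched with Bayes' rule. The prior share of true null hypotheses and the statistical power below are assumptions invented for illustration, not figures from the ASA, and the function name is hypothetical. Note that this computes the probability of the null given a significant result (p below alpha), not given one exact p value.

```python
def prob_null_given_significant(prior_null, alpha=0.05, power=0.8):
    """P(null is true | result is significant), by Bayes' rule.

    prior_null: assumed share of tested hypotheses where the null is true.
    alpha:      significance threshold (false positive rate under the null).
    power:      assumed chance of a significant result when the null is false.
    """
    false_positives = alpha * prior_null
    true_positives = power * (1 - prior_null)
    return false_positives / (false_positives + true_positives)

for prior_null in (0.5, 0.9):
    posterior = prob_null_given_significant(prior_null)
    print(f"prior P(null) = {prior_null:.1f} -> "
          f"P(null | significant) = {posterior:.2f}")
```

Under these assumptions, if 90% of the hypotheses a field tests are truly null, about 36% of its significant results still come from true nulls, which is very different from the 5% that the threshold might seem to promise.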

  3. Weak Effects

Even if we manage to reduce our p value to 0.004, it raises the question: does a statistically significant p value translate to meaningful real-world implications? A small p value can sometimes disguise the triviality of the effect being measured. The ASA emphasizes that conclusions and policy decisions should not rely solely on p values.
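As a rough illustration, the sketch below holds a trivially small effect fixed and varies only the sample size. It assumes a two-sided z test on a standardized mean difference with equal group sizes; the effect size of 0.02 standard deviations and the function name `two_sided_p` are made up for the example.

```python
from statistics import NormalDist

def two_sided_p(effect_size_d, n_per_group):
    """Two-sided z-test p value for an observed standardized mean difference.

    With unit variance and equal group sizes, the standard error of the
    difference is sqrt(2 / n), so z = d * sqrt(n / 2).
    """
    z = effect_size_d * (n_per_group / 2) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

tiny_effect = 0.02  # assumed: 2% of a standard deviation
for n in (100, 10_000, 100_000):
    print(f"n per group = {n:>7}: p = {two_sided_p(tiny_effect, n):.2g}")
```

The effect is identical in every row, yet the p value shrinks from clearly non-significant to far below 0.004 as the sample grows. Reporting the effect size next to the p value keeps that distinction visible.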

What’s Next?

Given the myriad issues surrounding p values, one might wonder whether they should be abandoned altogether. In 2015, the journal Basic and Applied Social Psychology prohibited the reporting of p values in the papers it publishes. While some argue for the complete elimination of p values, I believe that education on their appropriate use is essential. P values were never meant to be the sole focus of statistical analysis but rather one component among many in evaluating causality. Despite their limitations, p values can still provide valuable insights when used judiciously alongside other statistical methods and an understanding of the context.

Notes

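The alternative hypothesis for the minimum wage law example is one-sided: it claims an increase in unemployment. It could also have been phrased as a two-sided alternative (minimum wage laws change unemployment rates in either direction), but the formulation used above is generally preferred in labor economics.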
