Why Most Published Research Findings Are False

"Why Most Published Research Findings Are False"[1] is a 2005 research paper written by John Ioannidis, a professor at the Stanford School of Medicine, and published in PLOS Medicine. In the paper, Ioannidis argues that a large number, if not the majority, of published medical research papers contain results that cannot be replicated. The paper is considered foundational to the field of metascience.

Argument

Suppose that in a given scientific field there is a known baseline probability that a result is true, denoted by [math]\displaystyle{ \mathbb{P}(\text{True}) }[/math]. When a study is conducted, the probability that a positive result is obtained is [math]\displaystyle{ \mathbb{P}(+) }[/math]. Given these two quantities, we want to compute the conditional probability [math]\displaystyle{ \mathbb{P}(\text{True}\mid +) }[/math], which is known as the positive predictive value (PPV). Bayes' theorem allows us to compute the PPV as:

[math]\displaystyle{ \mathbb{P}(\text{True} \mid +) = {(1-\beta)\mathbb{P}(\text{True})\over{(1-\beta)\mathbb{P}(\text{True}) + \alpha\left[1-\mathbb{P}(\text{True})\right]}} }[/math]

where [math]\displaystyle{ \alpha }[/math] is the type I error rate and [math]\displaystyle{ \beta }[/math] is the type II error rate; the statistical power is [math]\displaystyle{ 1-\beta }[/math], and the denominator equals [math]\displaystyle{ \mathbb{P}(+) }[/math]. It is customary in most scientific research to aim for [math]\displaystyle{ \alpha = 0.05 }[/math] and [math]\displaystyle{ \beta = 0.2 }[/math]. If we assume [math]\displaystyle{ \mathbb{P}(\text{True}) = 0.1 }[/math] for a given scientific field, then we may compute the PPV for different values of [math]\displaystyle{ \alpha }[/math] and [math]\displaystyle{ \beta }[/math]:

PPV for [math]\displaystyle{ \mathbb{P}(\text{True}) = 0.1 }[/math], by [math]\displaystyle{ \alpha }[/math] (rows) and [math]\displaystyle{ \beta }[/math] (columns):

α \ β   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
0.01    0.91  0.90  0.89  0.87  0.85  0.82  0.77  0.69  0.53
0.02    0.83  0.82  0.80  0.77  0.74  0.69  0.63  0.53  0.36
0.03    0.77  0.75  0.72  0.69  0.65  0.60  0.53  0.43  0.27
0.04    0.71  0.69  0.66  0.63  0.58  0.53  0.45  0.36  0.22
0.05    0.67  0.64  0.61  0.57  0.53  0.47  0.40  0.31  0.18
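
The tabulated values follow directly from the Bayes formula. The following is a minimal Python sketch for recomputing them, not code from the paper; the function name `ppv`, its parameter names, and the constants `PRIOR`, `ALPHAS`, and `BETAS` are chosen here for illustration, with the prior fixed at the 0.1 assumed in the text.

```python
def ppv(prior, alpha, beta):
    """Positive predictive value P(True | +) from Bayes' theorem, without bias."""
    true_positive = (1 - beta) * prior      # P(+ and True)
    false_positive = alpha * (1 - prior)    # P(+ and not True)
    return true_positive / (true_positive + false_positive)


PRIOR = 0.1  # P(True), the value assumed in the text
ALPHAS = [0.01, 0.02, 0.03, 0.04, 0.05]
BETAS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

# Header row: one column per value of beta.
print("alpha  " + "  ".join(f"{b:.1f} " for b in BETAS))

# One row per value of alpha, PPV rounded to two decimals.
for a in ALPHAS:
    row = "  ".join(f"{ppv(PRIOR, a, b):.2f}" for b in BETAS)
    print(f"{a:.2f}   {row}")
```

For the benchmark values, `ppv(0.1, 0.05, 0.2)` evaluates to 0.64 up to floating-point error. Cells whose exact value lies on a rounding boundary, such as 0.625, may differ from the table in the last printed digit.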

However, the simple formula for PPV derived from Bayes' theorem does not account for bias in study design or reporting. In the presence of bias [math]\displaystyle{ u\in[0,1] }[/math], the PPV is given by the more general expression:

[math]\displaystyle{ \mathbb{P}(\text{True} \mid +) = {\left[1-(1-u)\beta \right ]\mathbb{P}(\text{True})\over{\left[1-(1-u)\beta \right ]\mathbb{P}(\text{True}) + \left[(1-u)\alpha + u \right ]\left[1-\mathbb{P}(\text{True}) \right ] }} }[/math]

Setting [math]\displaystyle{ u = 0 }[/math] recovers the simple formula. The introduction of bias will tend to depress the PPV; in the extreme case when the bias of a study is maximized ([math]\displaystyle{ u = 1 }[/math]), [math]\displaystyle{ \mathbb{P}(\text{True} \mid +) = \mathbb{P}(\text{True}) }[/math], so a positive result carries no information. Even if a study meets the benchmark requirements [math]\displaystyle{ \alpha = 0.05 }[/math] and [math]\displaystyle{ \beta = 0.2 }[/math] and is free of bias, with [math]\displaystyle{ \mathbb{P}(\text{True}) = 0.1 }[/math] the PPV is only 0.64, so there is still a 36% probability that a paper reporting a positive result will be incorrect; if the base probability of a true result is lower, then the PPV will be lower still. Furthermore, there is strong evidence that the average statistical power of a study in many scientific fields is well below the benchmark level of 0.8.[2][3][4]
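
To make the effect of bias and of a lower prior concrete, the general expression can be evaluated numerically. This is a minimal Python sketch under stated assumptions, not code from the paper: the function name `ppv_with_bias` and the particular values of the bias and of the prior swept below are illustrative choices.

```python
def ppv_with_bias(prior, alpha, beta, u):
    """PPV with a bias term u in [0, 1], following the general expression above."""
    true_positive = (1 - (1 - u) * beta) * prior          # P(+ and True)
    false_positive = ((1 - u) * alpha + u) * (1 - prior)  # P(+ and not True)
    return true_positive / (true_positive + false_positive)


ALPHA, BETA = 0.05, 0.2  # benchmark error rates from the text

# Effect of increasing bias at the prior assumed in the text, P(True) = 0.1.
for u in (0.0, 0.1, 0.3, 0.5, 1.0):
    print(f"u = {u:.1f}  PPV = {ppv_with_bias(0.1, ALPHA, BETA, u):.2f}")

# Effect of a lower prior with no bias (u = 0).
for prior in (0.5, 0.1, 0.01):
    print(f"P(True) = {prior:.2f}  PPV = {ppv_with_bias(prior, ALPHA, BETA, 0.0):.2f}")
```

With these inputs the PPV falls from 0.64 at zero bias to 0.16 at a bias of 0.5, and reaches the prior of 0.1 at maximal bias. Without bias, lowering the prior from 0.1 to 0.01 takes the PPV from 0.64 to about 0.14.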

Given the realities of bias, low statistical power, and a low proportion of true hypotheses among those tested, Ioannidis concludes that the majority of studies in a variety of scientific fields are likely to report results that are false.

Corollaries

In addition to the main result, Ioannidis lists six corollaries about factors that can influence the reliability of published research:

  1. The smaller the studies conducted in a scientific field, the less likely the research findings are to be true.
  2. The smaller the effect sizes in a scientific field, the less likely the research findings are to be true.
  3. The greater the number and the lesser the selection of tested relationships in a scientific field, the less likely the research findings are to be true.
  4. The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true.
  5. The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true.
  6. The hotter a scientific field (with more scientific teams involved), the less likely the research findings are to be true.

Influence

Despite initial skepticism about the claims made in the paper, Ioannidis's argument has been accepted by a large number of researchers.[5] The growth of metascience and the recognition of a scientific replication crisis have bolstered the paper's credibility, and led to calls for methodological reforms in scientific research.[6][7]

References

  1. Ioannidis, John P. A. (2005). "Why Most Published Research Findings Are False". PLOS Medicine 2 (8): e124. doi:10.1371/journal.pmed.0020124. ISSN 1549-1277. PMID 16060722.
  2. Button, Katherine S.; Ioannidis, John P. A.; Mokrysz, Claire; Nosek, Brian A.; Flint, Jonathan; Robinson, Emma S. J.; Munafò, Marcus R. (2013). "Power failure: why small sample size undermines the reliability of neuroscience". Nature Reviews Neuroscience 14 (5): 365–376. doi:10.1038/nrn3475. ISSN 1471-0048. PMID 23571845. https://www.nature.com/articles/nrn3475.
  3. Szucs, Denes; Ioannidis, John P. A. (2017-03-02). "Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature". PLOS Biology 15 (3): e2000797. doi:10.1371/journal.pbio.2000797. ISSN 1545-7885. PMID 28253258.
  4. Ioannidis, John P. A.; Stanley, T. D.; Doucouliagos, Hristos (2017). "The Power of Bias in Economics Research". The Economic Journal 127 (605): F236–F265. doi:10.1111/ecoj.12461. ISSN 1468-0297.
  5. Belluz, Julia (2015-02-16). "John Ioannidis has dedicated his life to quantifying how science is broken". Vox. https://www.vox.com/2015/2/16/8034143/john-ioannidis-interview.
  6. "Low power and the replication crisis: What have we learned since 2004 (or 1984, or 1964)?". Statistical Modeling, Causal Inference, and Social Science. https://statmodeling.stat.columbia.edu/2018/02/18/low-power-replication-crisis-learned-since-2004-1984-1964/.
  7. Wasserstein, Ronald L.; Lazar, Nicole A. (2016-04-02). "The ASA Statement on p-Values: Context, Process, and Purpose". The American Statistician 70 (2): 129–133. doi:10.1080/00031305.2016.1154108. ISSN 0003-1305.
