The distribution of P-values in medical research articles suggested selective reporting associated with statistical significance

Cited by: 32
Authors
Perneger, Thomas V. [1]
Combescure, Christophe
Affiliation
[1] Univ Geneva, Fac Med, Div Clin Epidemiol, 6 Rue Gabrielle Perret Gentil, CH-1211 Geneva, Switzerland
Keywords
Statistical tests; P-values; Publication bias; Practice of research; SCIENCE-WISE FALSE DISCOVERY RATE; PUBLICATION; INFERENCES; ABSTRACTS
DOI
10.1016/j.jclinepi.2017.04.003
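Chinese Library Classification number
R19 [Health care organizations and services (health services administration)];
Discipline classification code
Abstract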
Objectives: Published P-values provide a window into the global enterprise of medical research. The aim of this study was to use the distribution of published P-values to estimate the relative frequencies of null and alternative hypotheses and to seek irregularities suggestive of publication bias.
Study Design and Setting: This cross-sectional study included P-values published in 120 medical research articles in 2016 (30 each from the BMJ, JAMA, Lancet, and New England Journal of Medicine). The observed distribution of P-values was compared with the distributions expected under the null hypothesis (i.e., uniform between 0 and 1) and under the alternative hypothesis (strictly decreasing from 0 to 1). P-values were categorized according to conventional levels of statistical significance and in one-percent intervals.
Results: Among 4,158 recorded P-values, 26.1% were highly significant (P < 0.001), 9.1% were moderately significant (P >= 0.001 to < 0.01), 11.7% were weakly significant (P >= 0.01 to < 0.05), and 53.2% were nonsignificant (P >= 0.05). We noted three irregularities: (1) a high proportion of P-values < 0.001, especially in observational studies, (2) an excess of P-values equal to 1, and (3) about twice as many P-values just below 0.05 as just above 0.05. The last finding was seen in both randomized trials and observational studies, and in most types of analyses, except heterogeneity tests and interaction tests. Under plausible assumptions, we estimate that about half of the tested hypotheses were null and the other half were alternative.
Conclusion: This analysis suggests that statistical tests published in medical journals are not a random sample of null and alternative hypotheses but that selective reporting is prevalent. In particular, significant results are about twice as likely to be reported as nonsignificant results. (C) 2017 Elsevier Inc. All rights reserved.
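The categorization described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy P-values, and the cutoff `lam` are all hypothetical. The last function is a Storey-style estimate of the share of null hypotheses, swapped in here only to show how a uniform null distribution can be exploited; the abstract does not say which estimator the authors used, and any such estimate is itself distorted by selective reporting.

```python
import math
from collections import Counter


def significance_category(p):
    """Map a P-value to one of the four conventional levels named in the abstract."""
    if p < 0.001:
        return "highly significant"
    if p < 0.01:
        return "moderately significant"
    if p < 0.05:
        return "weakly significant"
    return "nonsignificant"


def one_percent_bin(p):
    """Index k of the interval [k/100, (k+1)/100); P = 1 is placed in the last bin."""
    # Rounding first guards against float artifacts such as 0.29 * 100 == 28.999...
    return min(math.floor(round(p * 100, 9)), 99)


def ratio_around_threshold(p_values):
    """Count of P in [0.04, 0.05) divided by count of P in [0.05, 0.06)."""
    bins = Counter(one_percent_bin(p) for p in p_values)
    if bins[5] == 0:
        return None  # ratio undefined when nothing falls just above 0.05
    return bins[4] / bins[5]


def storey_null_fraction(p_values, lam=0.5):
    """Illustrative estimate of the null share: null P-values are uniform,
    so the fraction above lam, rescaled by 1 / (1 - lam), approximates it."""
    above = sum(1 for p in p_values if p > lam)
    return min(1.0, above / (len(p_values) * (1.0 - lam)))


# Hypothetical toy data, not taken from the study.
toy_p_values = [0.0004, 0.003, 0.02, 0.041, 0.044, 0.045, 0.049,
                0.052, 0.058, 0.3, 0.6, 0.8, 0.9, 1.0]

category_counts = Counter(significance_category(p) for p in toy_p_values)
```

Calling `ratio_around_threshold(toy_p_values)` on real data would give the kind of just-below versus just-above comparison reported as the third irregularity; a ratio near 1 would be expected if reporting did not depend on crossing 0.05.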
Pages: 70-77
Number of pages: 8