FALSE DISCOVERY RATES IN SOMATIC MUTATION STUDIES OF CANCER

Cited by: 6
Authors
Trippa, Lorenzo [1]
Parmigiani, Giovanni
Affiliations
[1] Dana-Farber Cancer Institute, Boston, MA 02115 USA
Source
ANNALS OF APPLIED STATISTICS | 2011 / Vol. 5 / Issue 2B
Keywords
Cancer genome studies; genome-wide studies; false discovery rate; multiple hypothesis testing; somatic mutations; CONSENSUS CODING SEQUENCES; EMPIRICAL BAYES; HUMAN BREAST; 2-STAGE DESIGNS; ASSOCIATION;
DOI
10.1214/10-AOAS438
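Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes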
020208 ; 070103 ; 0714 ;
Abstract
The purpose of cancer genome sequencing studies is to determine the nature and types of alterations present in a typical cancer and to discover genes mutated at high frequencies. In this article we discuss statistical methods for the analysis of somatic mutation frequency data generated in these studies. We place special emphasis on a two-stage study design introduced by Sjöblom et al. [Science 314 (2006) 268-274]. In this context, we describe and compare statistical methods for constructing scores that can be used to prioritize candidate genes for further investigation and to assess the statistical significance of the candidates thus identified. Controversy has surrounded the reliability of the false discovery rate estimates provided by the approximations used in early cancer genome studies. To address this controversy, we develop a semiparametric Bayesian model that provides an accurate fit to the data. We use this model to generate a large collection of realistic scenarios, and evaluate alternative approaches on this collection. Our assessment is impartial, in that the model used for generating data is not used by any of the approaches compared, and objective, in that the scenarios are generated by a model that fits the data. Our results quantify the conservative control of the false discovery rate with the Benjamini and Hochberg method compared to the empirical Bayes approach and the multiple testing method proposed in Storey [J. R. Stat. Soc. Ser. B Stat. Methodol. 64 (2002) 479-498]. Simulation results also show a negligible departure from the target false discovery rate for the methodology used in Sjöblom et al. [Science 314 (2006) 268-274].
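To make the comparison named in the abstract concrete, here is a minimal sketch of the Benjamini and Hochberg step-up rule next to an adaptive variant that plugs in Storey's (2002) estimate of the null proportion. It is an illustration only, not the paper's analysis: the p-values are made up, the function names (`step_up`, `storey_pi0`), the target level `alpha = 0.10`, and the tuning value `lam = 0.5` are assumptions, and neither the semiparametric Bayesian model nor the Sjöblom et al. methodology is reproduced.

```python
def step_up(pvalues, alpha, null_fraction=1.0):
    """Step-up rule: reject the k smallest p-values, where k is the largest
    rank with null_fraction * m * p_(k) <= alpha * k.
    null_fraction = 1.0 gives the Benjamini-Hochberg procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if null_fraction * m * pvalues[idx] <= alpha * rank:
            k = rank
    return sorted(order[:k])  # indices of rejected hypotheses


def storey_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of true nulls, capped at 1:
    #{p > lam} / ((1 - lam) * m)."""
    m = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    return min(1.0, above / ((1.0 - lam) * m))


# Hypothetical p-values for ten genes (not data from the paper).
toy_pvalues = [0.001, 0.008, 0.012, 0.041, 0.06, 0.2, 0.35, 0.6, 0.75, 0.9]
alpha = 0.10  # assumed target false discovery rate

bh_rejected = step_up(toy_pvalues, alpha)
pi0_hat = storey_pi0(toy_pvalues)
adaptive_rejected = step_up(toy_pvalues, alpha, null_fraction=pi0_hat)

print("BH rejections:      ", bh_rejected)
print("Estimated pi0:      ", pi0_hat)
print("Adaptive rejections:", adaptive_rejected)
```

On this toy input the plain Benjamini and Hochberg rule rejects fewer hypotheses than the adaptive variant, because it implicitly treats every hypothesis as a true null. This is the sense in which its control of the false discovery rate is conservative.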
Pages: 1360-1378
Page count: 19
Related papers
50 in total
  • [1] Evolution of somatic mutation rates
    Cagan, Alex
    EUROPEAN JOURNAL OF HUMAN GENETICS, 2024, 32 : 90 - 90
  • [2] PRESM: personalized reference editor for somatic mutation discovery in cancer genomics
    Cao, Chen
    Mak, Lauren
    Jin, Guangxu
    Gordon, Paul
    Ye, Kai
    Long, Quan
    BIOINFORMATICS, 2019, 35 (09) : 1445 - 1452
  • [3] A direct approach to false discovery rates
    Storey, JD
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2002, 64 : 479 - 498
  • [4] False discovery rates in spectral identification
    Jeong, Kyowon
    Kim, Sangtae
    Bandeira, Nuno
    BMC BIOINFORMATICS, 2012, 13
  • [5] False discovery rates: a new deal
    Stephens, Matthew
    BIOSTATISTICS, 2017, 18 (02) : 275 - 294
  • [6] False discovery rates and multiple testing
    Dey S.
    Delampady M.
    Resonance, 2013, 18 (12) : 1095 - 1109
  • [7] False discovery rates for spatial signals
    Benjamini, Yoav
    Heller, Ruth
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2007, 102 (480) : 1272 - 1281
  • [8] False discovery rates in spectral identification
    Kyowon Jeong
    Sangtae Kim
    Nuno Bandeira
    BMC Bioinformatics, 13
  • [9] False Discovery Rates in Biological Networks
    Yu, Lu
    Kaufmann, Tobias
    Lederer, Johannes
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130 : 163 - +
  • [10] Size, power and false discovery rates
    Efron, Bradley
    ANNALS OF STATISTICS, 2007, 35 (04) : 1351 - 1377