Heteroscedasticity-Adjusted Ranking and Thresholding for Large-Scale Multiple Testing

Cited by: 2
Authors
Fu, Luella [1]
Gang, Bowen [2]
James, Gareth M. [3]
Sun, Wenguang [3]
Affiliations
[1] San Francisco State Univ, Dept Math, San Francisco, CA 94132 USA
[2] Fudan Univ, Dept Stat, Shanghai, Peoples R China
[3] Univ Southern Calif, Dept Data Sci & Operat, Los Angeles, CA 90089 USA
Keywords
Covariate-assisted inference; Data processing and information loss; False discovery rate; Heteroscedasticity; Multiple testing with side information; Structured multiple testing; FALSE-DISCOVERY RATE; GENE-EXPRESSION; EMPIRICAL BAYES; POWER; HYPOTHESES; NULL; MICROARRAYS;
DOI
10.1080/01621459.2020.1840992
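Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract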
Standardization has been a widely adopted practice in multiple testing, for it takes into account the variability in sampling and makes the test statistics comparable across different study units. However, despite conventional wisdom to the contrary, we show that there can be a significant loss in information from basing hypothesis tests on standardized statistics rather than the full data. We develop a new class of heteroscedasticity-adjusted ranking and thresholding (HART) rules that aim to improve existing methods by simultaneously exploiting commonalities and adjusting heterogeneities among the study units. The main idea of HART is to bypass standardization by directly incorporating both the summary statistic and its variance into the testing procedure. A key message is that the variance structure of the alternative distribution, which is subsumed under standardized statistics, is highly informative and can be exploited to achieve higher power. The proposed HART procedure is shown to be asymptotically valid and optimal for false discovery rate (FDR) control. Our simulation results demonstrate that HART achieves substantial power gain over existing methods at the same FDR level. We illustrate the implementation through a microarray analysis of myeloma.
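The information loss from standardization described in the abstract can be illustrated with a minimal sketch. It is not the authors' HART estimator: it assumes a toy two-group model in which the null mean is 0, every non-null unit has the same known mean `mu_alt`, and the non-null proportion `pi_alt` is known. The function names and all parameter values are hypothetical, chosen only for illustration. The step-up rule shown is the generic "reject while the running mean of local FDR values stays at or below alpha" thresholding, used here as a stand-in for the thresholding step.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def conditional_lfdr(x, sigma, pi_alt=0.1, mu_alt=2.0):
    """Local FDR given both the statistic x and its standard deviation sigma.

    Assumed toy model (not from the paper): x ~ N(0, sigma^2) under the null
    and x ~ N(mu_alt, sigma^2) under the alternative, with known pi_alt.
    """
    f0 = normal_pdf(x, 0.0, sigma)
    f1 = normal_pdf(x, mu_alt, sigma)
    return (1.0 - pi_alt) * f0 / ((1.0 - pi_alt) * f0 + pi_alt * f1)


def step_up_reject(scores, alpha):
    """Reject the k smallest scores, with k the largest index whose running
    mean of sorted scores is <= alpha. Returns the rejected indices."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    running, k = 0.0, 0
    for j, i in enumerate(order, start=1):
        running += scores[i]
        if running / j <= alpha:
            k = j
    return sorted(order[:k])


# Two units with the same standardized value z = x / sigma = 2
# but different standard deviations.
lfdr_small_sigma = conditional_lfdr(x=1.0, sigma=0.5)
lfdr_large_sigma = conditional_lfdr(x=4.0, sigma=2.0)
print(lfdr_small_sigma, lfdr_large_sigma)

# Thresholding a small, made-up vector of local FDR values at alpha = 0.1.
print(step_up_reject([0.01, 0.5, 0.02, 0.9], alpha=0.1))
```

Under these assumptions the two units share z = 2, so any rule based on z alone must treat them identically, yet their conditional local FDR values differ because sigma changes how far the alternative mean sits from the null in standardized units. This is one simple way to see why the pair (x, sigma) can carry more information than z.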
Pages: 1028-1040
Number of pages: 13