Large-scale dependent multiple testing via hidden semi-Markov models

Authors
Wang, Jiangzhou [1 ]
Wang, Pengfei [2 ]
Affiliations
[1] Shenzhen Univ, Inst Stat Sci, Coll Math & Stat, Shenzhen 518060, Peoples R China
[2] Dongbei Univ Finance & Econ, Sch Stat, Dalian 116025, Peoples R China
Keywords
FDR; Hidden semi-Markov model; Multiple testing; False discovery rate; Empirical Bayes
DOI
10.1007/s00180-023-01367-z
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
Large-scale multiple testing is common in the statistical analysis of high-dimensional data. Conventional multiple testing procedures usually assume, often implicitly, that the tests are independent. However, this assumption rarely holds in practical applications, particularly in "high-throughput" data analysis. Incorporating information about the dependence structure among tests can improve statistical power and the interpretability of discoveries. In this paper, we propose a new large-scale dependent multiple testing procedure based on the hidden semi-Markov model (HSMM), which characterizes local correlations among tests using a semi-Markov process instead of a first-order Markov chain. This approach allows the number of consecutive null hypotheses to follow any reasonable distribution, enabling a more accurate description of complex local correlations. We show that the proposed procedure minimizes the marginal false non-discovery rate (mFNR) at the same marginal false discovery rate (mFDR) level. To reduce the computational complexity of the HSMM, we approximate it by a hidden Markov model (HMM) with an expanded state space. We provide a forward-backward algorithm and an expectation-maximization (EM) algorithm for implementing the proposed procedure. Finally, we demonstrate the superior performance of the proposed procedure, referred to as SMLIS, through extensive simulations and a real data analysis.
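The pipeline outlined in the abstract (an expanded-state HMM, a forward-backward pass, then a rejection rule at a target mFDR level) can be illustrated with a minimal sketch. It is not the authors' SMLIS implementation. It assumes all parameters are known (no EM step), Gaussian emissions, a toy three-state chain in which the null state is split into two sub-states so that null runs have length at least two, and the familiar step-up rule on posterior null probabilities used in HMM-based testing. All function names, parameter values, and the level 0.10 are hypothetical choices for illustration.

```python
import numpy as np


def posterior_state_probs(init, trans, emis):
    """Scaled forward-backward pass for a finite-state HMM.

    init:  (K,) initial state distribution
    trans: (K, K) transition matrix with rows summing to 1
    emis:  (n, K) emission densities evaluated at each observation
    Returns an (n, K) array of P(state_i = k | all observations).
    """
    n, K = emis.shape
    alpha = np.empty((n, K))
    beta = np.empty((n, K))
    scale = np.empty(n)

    # Forward pass, rescaled at every step to avoid underflow.
    alpha[0] = init * emis[0]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for i in range(1, n):
        alpha[i] = (alpha[i - 1] @ trans) * emis[i]
        scale[i] = alpha[i].sum()
        alpha[i] /= scale[i]

    # Backward pass, reusing the forward scaling constants.
    beta[-1] = 1.0
    for i in range(n - 2, -1, -1):
        beta[i] = (trans @ (emis[i + 1] * beta[i + 1])) / scale[i + 1]

    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)


def lis_step_up(lis, level):
    """Reject the hypotheses with the smallest posterior null probabilities
    for as long as their running average stays at or below `level`."""
    order = np.argsort(lis)
    running_mean = np.cumsum(lis[order]) / np.arange(1, lis.size + 1)
    passing = np.nonzero(running_mean <= level)[0]
    reject = np.zeros(lis.size, dtype=bool)
    if passing.size:
        reject[order[: passing[-1] + 1]] = True
    return reject


def normal_pdf(x, mean, sd=1.0):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))


# Toy expanded state space (assumed values): states 0 and 1 are null
# sub-states, state 2 is non-null. State 0 always moves to state 1, so
# every null run lasts at least two steps (a non-geometric duration).
init = np.array([1.0, 0.0, 0.0])
trans = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.9, 0.1],
                  [0.2, 0.0, 0.8]])

rng = np.random.default_rng(0)
n = 500
states = np.empty(n, dtype=int)
states[0] = rng.choice(3, p=init)
for i in range(1, n):
    states[i] = rng.choice(3, p=trans[states[i - 1]])

is_null = states < 2
x = rng.normal(loc=np.where(is_null, 0.0, 2.0), scale=1.0)

# Both null sub-states share the same N(0, 1) emission density.
emis = np.column_stack([normal_pdf(x, 0.0), normal_pdf(x, 0.0), normal_pdf(x, 2.0)])
gamma = posterior_state_probs(init, trans, emis)
lis = gamma[:, :2].sum(axis=1)          # posterior probability of the null
reject = lis_step_up(lis, level=0.10)
```

Summing the posterior over the null sub-states is what lets an ordinary HMM routine serve the expanded chain: the duration structure lives entirely in the transition matrix, and the test statistic is still a single posterior null probability per hypothesis.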
Pages: 1093-1126
Page count: 34
Related papers
50 in total
  • [1] Large-scale dependent multiple testing via hidden semi-Markov models
    Wang, Jiangzhou
    Wang, Pengfei
    COMPUTATIONAL STATISTICS, 2024, 39 : 1093 - 1126
  • [2] Bayesian hidden Markov models for dependent large-scale multiple testing
    Wang, Xia
    Shojaie, Ali
    Zou, Jian
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2019, 136 : 123 - 136
  • [3] Large-scale multiple testing via multivariate hidden Markov models
    Hou, Zhiqiang
    Wang, Pengfei
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2024, 53 (04) : 1932 - 1951
  • [4] Large-scale dependent multiple testing via higher-order hidden Markov models
    Li, Canhui
    Wang, Jiangzhou
    Wang, Pengfei
    JOURNAL OF BIOPHARMACEUTICAL STATISTICS, 2024,
  • [5] Large-scale spatial variability of rainfall through hidden semi-Markov models of breakpoint data
    Sansom, J
    JOURNAL OF GEOPHYSICAL RESEARCH-ATMOSPHERES, 1999, 104 (D24) : 31631 - 31643
  • [6] Hidden semi-Markov models
    Yu, Shun-Zheng
    ARTIFICIAL INTELLIGENCE, 2010, 174 (02) : 215 - 243
  • [7] Semi-parametric hidden Markov model for large-scale multiple testing under dependency
    Kim, Joungyoun
    Lim, Johan
    Lee, Jong Soo
    STATISTICAL MODELLING, 2024, 24 (04) : 320 - 343
  • [8] Feature Selection for Hidden Markov Models and Hidden Semi-Markov Models
    Adams, Stephen
    Beling, Peter A.
    Cogill, Randy
    IEEE ACCESS, 2016, 4 : 1642 - 1657
  • [9] A Large-Scale Hidden Semi-Markov Model for Anomaly Detection on User Browsing Behaviors
    Xie, Yi
    Yu, Shun-Zheng
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2009, 17 (01) : 54 - 65
  • [10] Large-scale event detection using semi-hidden Markov models
    Hongeng, S
    Nevatia, R
    NINTH IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOLS I AND II, PROCEEDINGS, 2003, : 1455 - 1462