Large-scale dependent multiple testing via hidden semi-Markov models

Authors
Wang, Jiangzhou [1]
Wang, Pengfei [2]
Affiliations
[1] Shenzhen Univ, Inst Stat Sci, Coll Math & Stat, Shenzhen 518060, Peoples R China
[2] Dongbei Univ Finance & Econ, Sch Stat, Dalian 116025, Peoples R China
Keywords
FDR; Hidden semi-Markov model; Multiple testing; False discovery rate; Empirical Bayes
DOI
10.1007/s00180-023-01367-z
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification code
020208; 070103; 0714
Abstract
Large-scale multiple testing is common in the statistical analysis of high-dimensional data. Conventional multiple testing procedures usually assume, often implicitly, that the tests are independent. However, this assumption rarely holds in practical applications, particularly in "high-throughput" data analysis. Incorporating information about the dependence structure among tests can improve statistical power and the interpretability of discoveries. In this paper, we propose a new large-scale dependent multiple testing procedure based on the hidden semi-Markov model (HSMM), which characterizes local correlations among tests using a semi-Markov process instead of a first-order Markov chain. This approach allows the number of consecutive null hypotheses to follow any reasonable distribution, enabling a more accurate description of complex local correlations. We show that the proposed procedure minimizes the marginal false non-discovery rate (mFNR) among procedures that control the marginal false discovery rate (mFDR) at the same level. To reduce the computational complexity of the HSMM, we approximate it with a hidden Markov model (HMM) on an expanded state space. We provide a forward-backward algorithm and an expectation-maximization (EM) algorithm for implementing the proposed procedure. Finally, we demonstrate the superior performance of the proposed procedure, referred to as SMLIS, through extensive simulations and a real data analysis.
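The abstract does not spell out the decision rule. As a rough illustration only, the sketch below assumes that the procedure ranks hypotheses by a local-index-of-significance (LIS) type statistic, namely the posterior probability that each hypothesis is null given all observations, and applies the step-up rule commonly used in HMM-based multiple testing: reject the hypotheses with the k smallest statistics, where k is the largest value for which their running average stays at or below the target mFDR level. The function name `lis_step_up` and the example values are hypothetical, and the input statistics would in practice come from a forward-backward pass of a fitted model, which is not shown here.

```python
from typing import List, Sequence


def lis_step_up(lis: Sequence[float], alpha: float) -> List[bool]:
    """Step-up rule on LIS-type statistics (illustrative sketch).

    lis[i] is assumed to be the posterior probability that hypothesis i
    is null given all observations. Returns True where i is rejected.
    """
    # Rank hypotheses from most to least likely non-null.
    order = sorted(range(len(lis)), key=lambda i: lis[i])

    running_sum = 0.0
    k = 0  # number of hypotheses to reject
    for rank, idx in enumerate(order, start=1):
        running_sum += lis[idx]
        # The running average of the smallest `rank` statistics estimates
        # the false discovery proportion if those hypotheses are rejected.
        if running_sum / rank <= alpha:
            k = rank

    reject = [False] * len(lis)
    for idx in order[:k]:
        reject[idx] = True
    return reject


# Hypothetical statistics for six hypotheses, tested at level 0.10.
example_lis = [0.01, 0.90, 0.03, 0.40, 0.02, 0.75]
decisions = lis_step_up(example_lis, alpha=0.10)
print(decisions)
```

In this toy input, the three smallest statistics average 0.02, while adding the fourth raises the average to 0.115, so only the first three ranked hypotheses are rejected at level 0.10.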
Pages: 1093-1126
Number of pages: 34