Adaptive Testing for High-Dimensional Data

Authors
Zhang, Yangfan [1]
Wang, Runmin [2]
Shao, Xiaofeng [3]
Affiliations
[1] Two Sigma Investments, New York, NY USA
[2] Texas A&M Univ, Dept Stat, 3143 TAMU, College Stn, TX 77843 USA
[3] Univ Illinois, Dept Stat, Champaign, IL USA
Keywords
Independence testing; Simultaneous testing; Spatial sign; U-statistics; Higher criticism; Covariance matrix; Two-sample test; Asymptotic distributions; Independence; Coherence; Signals; ANOVA
DOI
10.1080/01621459.2024.2439617
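Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract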
In this article, we propose a class of $L_q$-norm based U-statistics for a family of global testing problems related to high-dimensional data. This includes testing of the mean vector and its spatial sign, simultaneous testing of linear model coefficients, and testing of component-wise independence for high-dimensional observations, among others. Under the null hypothesis, we derive asymptotic normality and independence between $L_q$-norm based U-statistics for several values of $q$ under mild moment and cumulant conditions. A simple combination of two studentized $L_q$-based test statistics via their p-values is proposed and is shown to attain great power against alternatives of different sparsity. Our work is a substantial extension of He et al., which is mostly focused on mean and covariance testing, and we manage to provide a general treatment of asymptotic independence of $L_q$-norm based U-statistics for a wide class of kernels. To alleviate the computational burden, we introduce a variant of the proposed U-statistics by using the monotone indices in the summation, resulting in a U-statistic with an asymmetric kernel. A dynamic programming method is introduced to reduce the computational cost from $O(n^{qr})$, which is required for the calculation of the full U-statistic, to $O(n^{r})$, where $r$ is the order of the kernel. Numerical results further corroborate the advantage of the proposed adaptive test as compared to some existing competitors. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
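The two computational ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the simplest case, mean testing with a kernel of order r = 1, where the monotone-index sum for coordinate j is the sum of x[i1, j] * ... * x[iq, j] over i1 < ... < iq, and it omits centering and studentization. The function names are hypothetical, and the minimum p-value rule is one common way to combine two asymptotically independent tests, assumed here because the abstract does not spell out the combination rule.

```python
from itertools import combinations

import numpy as np


def monotone_lq_sum(x: np.ndarray, q: int) -> float:
    """Sum over coordinates j of sum_{i1 < ... < iq} x[i1, j] * ... * x[iq, j].

    x has shape (n, p). The dynamic program makes one pass over the n rows
    and costs O(n * q * p), instead of enumerating all increasing q-tuples.
    """
    n, p = x.shape
    # e[k, j]: sum of products over increasing k-tuples among the rows seen so far
    e = np.zeros((q + 1, p))
    e[0] = 1.0
    for i in range(n):
        # Go from high k to low k so that row i enters each tuple at most once.
        for k in range(min(q, i + 1), 0, -1):
            e[k] += e[k - 1] * x[i]
    return float(e[q].sum())


def brute_force_lq_sum(x: np.ndarray, q: int) -> float:
    """Same quantity by direct enumeration; only feasible for small n."""
    n, _ = x.shape
    total = 0.0
    for idx in combinations(range(n), q):
        total += np.prod(x[list(idx)], axis=0).sum()
    return float(total)


def min_p_combination(p_a: float, p_b: float) -> float:
    """Combined p-value from the minimum of two independent p-values.

    If p_a and p_b are independent and uniform under the null, then
    P(min(p_a, p_b) <= t) = 1 - (1 - t)^2.
    """
    return 1.0 - (1.0 - min(p_a, p_b)) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 3))
    for q in (2, 3, 4):
        print(q, monotone_lq_sum(x, q), brute_force_lq_sum(x, q))
```

In this special case the kernel is a plain product, so the monotone-index sum equals the full sum over distinct indices divided by q!. For kernels of order r > 1, the restriction to monotone indices gives a different, asymmetric-kernel statistic, as the abstract notes.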
Pages: 13