Adaptive Testing for High-Dimensional Data

Authors
Zhang, Yangfan [1]
Wang, Runmin [2]
Shao, Xiaofeng [3]
Affiliations
[1] Two Sigma Investments, New York, NY USA
[2] Texas A&M Univ, Dept Stat, 3143 TAMU, College Stn, TX 77843 USA
[3] Univ Illinois, Dept Stat, Champaign, IL USA
Keywords
Independence testing; Simultaneous testing; Spatial sign; U-statistics; Higher criticism; Covariance matrix; Two-sample test; Asymptotic distributions; Independence; Coherence; Signals; ANOVA
DOI
10.1080/01621459.2024.2439617
Chinese Library Classification (CLC) number
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract
In this article, we propose a class of $L_q$-norm based U-statistics for a family of global testing problems related to high-dimensional data. This includes testing of the mean vector and its spatial sign, simultaneous testing of linear model coefficients, and testing of component-wise independence for high-dimensional observations, among others. Under the null hypothesis, we derive asymptotic normality of, and asymptotic independence between, $L_q$-norm based U-statistics for several values of $q$ under mild moment and cumulant conditions. A simple combination of two studentized $L_q$-based test statistics via their p-values is proposed and is shown to attain high power against alternatives of different sparsity. Our work is a substantial extension of He et al., which is mostly focused on mean and covariance testing, and we provide a general treatment of the asymptotic independence of $L_q$-norm based U-statistics for a wide class of kernels. To alleviate the computational burden, we introduce a variant of the proposed U-statistics that uses only monotone indices in the summation, resulting in a U-statistic with an asymmetric kernel. A dynamic programming method is introduced to reduce the computational cost from $O(n^{qr})$, which is required for the calculation of the full U-statistic, to $O(n^r)$, where $r$ is the order of the kernel. Numerical results further corroborate the advantage of the proposed adaptive test over some existing competitors. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
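The two computational ideas in the abstract can be illustrated with a minimal sketch. It covers only the simplest case, a kernel of order r = 1 (mean testing), where the monotone-index sum over i_1 < ... < i_q of the product of coordinates can be accumulated in a single pass over the n observations instead of enumerating all q-tuples. The function names `monotone_sum_dp`, `monotone_sum_brute`, and `min_p_combination` are hypothetical, and the minimum-p rule shown is one common way to combine independent p-values, assumed here for illustration; the abstract does not state which combination rule the authors use, and this is not their implementation.

```python
from itertools import combinations

import numpy as np


def monotone_sum_dp(x, q):
    """Sum over i_1 < ... < i_q of prod_k x[i_k, l], added up over coordinates l.

    Illustrative case of a kernel of order r = 1; one pass over the n rows.
    """
    n, p = x.shape
    # dp[k] holds, per coordinate, the sum of products over all increasing
    # k-tuples of the rows processed so far (dp[0] is the empty product).
    dp = np.zeros((q + 1, p))
    dp[0] = 1.0
    for i in range(n):
        # Go from high k to low k so that row i enters each tuple at most once.
        for k in range(min(q, i + 1), 0, -1):
            dp[k] += dp[k - 1] * x[i]
    return float(dp[q].sum())


def monotone_sum_brute(x, q):
    """Same quantity by explicit enumeration of all increasing q-tuples."""
    n, _ = x.shape
    total = 0.0
    for idx in combinations(range(n), q):
        total += np.prod(x[list(idx)], axis=0).sum()
    return float(total)


def min_p_combination(pvals):
    """Minimum-p combination, valid when the p-values are independent.

    Assumption for illustration: the source only says the two tests are
    combined "via their p-values".
    """
    pvals = np.asarray(pvals, dtype=float)
    return float(1.0 - (1.0 - pvals.min()) ** pvals.size)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((12, 5))
    print(monotone_sum_dp(x, 3), monotone_sum_brute(x, 3))
    print(min_p_combination([0.04, 0.30]))
```

The dynamic-programming version and the brute-force version should agree up to floating-point error. The first touches each observation once per value of k, while the second grows combinatorially in n.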
Pages: 13