SELECTING THE NUMBER OF PRINCIPAL COMPONENTS: ESTIMATION OF THE TRUE RANK OF A NOISY MATRIX

Cited by: 46
Authors
Choi, Yunjin [1 ]
Taylor, Jonathan [2 ]
Tibshirani, Robert [3 ]
Affiliations
[1] Natl Univ Singapore, Dept Stat & Appl Probabil, Block S16,Level 6,Sci Dr 2, Singapore 117546, Singapore
[2] Stanford Univ, Dept Stat, 390 Serra Mall, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Hlth Res & Policy, 390 Serra Mall, Stanford, CA 94305 USA
Source
ANNALS OF STATISTICS | 2017 / Vol. 45 / No. 06
Keywords
Principal components; hypothesis test; exact p-value; REGRESSION;
DOI
10.1214/16-AOS1536
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
Principal component analysis (PCA) is a well-known tool in multivariate statistics. One significant challenge in using PCA is the choice of the number of principal components. In order to address this challenge, we propose distribution-based methods with exact type 1 error controls for hypothesis testing and construction of confidence intervals for signals in a noisy matrix with finite samples. Assuming Gaussian noise, we derive exact type 1 error controls based on the conditional distribution of the singular values of a Gaussian matrix by utilizing a post-selection inference framework, and extending the approach of [Taylor, Loftus and Tibshirani (2013)] in a PCA setting. In simulation studies, we find that our proposed methods compare well to existing approaches.
Pages: 2590 - 2617
Page count: 28
Related papers
50 in total
  • [31] Inference of principal components of noisy correlation matrices with prior information
    Monasson, Remi
    2016 50TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 2016, : 95 - 99
  • [32] ESTIMATION OF TRUE NUMBER OF CONGENITAL MALFORMATIONS
    HAY, S
    MACKEPRANG, M
    PEDIATRICS, 1971, 47 (06) : 1094 - +
  • [33] Stepwise estimation of common principal components
    Trendafilov, Nickolay T.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2010, 54 (12) : 3446 - 3457
  • [34] The Sparse Principal Component of a Constant-Rank Matrix
    Asteris, Megasthenis
    Papailiopoulos, Dimitris S.
    Karystinos, George N.
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2014, 60 (04) : 2281 - 2290
  • [35] The principal rank characteristic sequence of a real symmetric matrix
    Brualdi, R. A.
    Deaett, L.
    Olesky, D. D.
    van den Driessche, P.
    LINEAR ALGEBRA AND ITS APPLICATIONS, 2012, 436 (07) : 2137 - 2155
  • [36] PRINCIPAL COMPONENT ANALYSIS WITH DROP RANK COVARIANCE MATRIX
    Guo, Yitong
    Ling, Bingo Wing-Kuen
    JOURNAL OF INDUSTRIAL AND MANAGEMENT OPTIMIZATION, 2021, 17 (05) : 2345 - 2366
  • [37] Sparse Principal Component of a Rank-deficient Matrix
    Asteris, Megasthenis
    Papailiopoulos, Dimitris S.
    Karystinos, George N.
    2011 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY PROCEEDINGS (ISIT), 2011, : 673 - 677
  • [38] Determining the number of principal components for best reconstruction
    Qin, SJ
    Dunia, R
    DYNAMICS & CONTROL OF PROCESS SYSTEMS 1998, VOLUMES 1 AND 2, 1999, : 357 - 362
  • [39] Feasible model-based principal component analysis: Joint estimation of rank and error covariance matrix
    Chan, Tak-Shing T.
    Gibberd, Alex
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2025, 201