SELECTING THE NUMBER OF PRINCIPAL COMPONENTS: ESTIMATION OF THE TRUE RANK OF A NOISY MATRIX

Cited by: 46
Authors
Choi, Yunjin [1 ]
Taylor, Jonathan [2 ]
Tibshirani, Robert [3 ]
Affiliations
[1] Natl Univ Singapore, Dept Stat & Appl Probabil, Block S16,Level 6,Sci Dr 2, Singapore 117546, Singapore
[2] Stanford Univ, Dept Stat, 390 Serra Mall, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Hlth Res & Policy, 390 Serra Mall, Stanford, CA 94305 USA
Source
ANNALS OF STATISTICS | 2017 / Vol. 45 / Issue 06
Keywords
Principal components; hypothesis test; exact p-value; REGRESSION;
DOI
10.1214/16-AOS1536
Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification code
020208 ; 070103 ; 0714 ;
Abstract
Principal component analysis (PCA) is a well-known tool in multivariate statistics. One significant challenge in using PCA is the choice of the number of principal components. To address this challenge, we propose distribution-based methods with exact type 1 error control for hypothesis testing and construction of confidence intervals for signals in a noisy matrix with finite samples. Assuming Gaussian noise, we derive exact type 1 error control based on the conditional distribution of the singular values of a Gaussian matrix, utilizing a post-selection inference framework and extending the approach of Taylor, Loftus and Tibshirani (2013) to a PCA setting. In simulation studies, we find that our proposed methods compare well to existing approaches.
Pages: 2590-2617
Page count: 28
Related papers
50 in total
  • [41] Determining the number of principal components for best reconstruction
    Qin, SJ
    Dunia, R
    JOURNAL OF PROCESS CONTROL, 2000, 10 (2-3) : 245 - 250
  • [42] Efficient R-Estimation of Principal and Common Principal Components
    Hallin, Marc
    Paindaveine, Davy
    Verdebout, Thomas
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2014, 109 (507) : 1071 - 1083
  • [43] Rank selection in noisy PCA with SURE and Random Matrix Theory
    Ulfarsson, M. O.
    Solo, V.
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 3317 - +
  • [44] THE NIPALS ALGORITHM FOR THE CALCULATION OF THE PRINCIPAL COMPONENTS OF A MATRIX
    VANDEGINSTE, BGM
    SIELHORST, C
    GERRITSEN, M
    TRAC-TRENDS IN ANALYTICAL CHEMISTRY, 1988, 7 (08) : 286 - 287
  • [45] An effective method for selecting the number of components in density mixtures
    Whitaker, S.
    Lee, T. C. M.
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2007, 77 (10) : 907 - 914
  • [46] Optimal rank-based tests for Common Principal Components
    Hallin, Marc
    Paindaveine, Davy
    Verdebout, Thomas
    BERNOULLI, 2013, 19 (5B) : 2524 - 2556
  • [47] MINIMIZATION OF EIGENVALUES OF A MATRIX AND OPTIMALITY OF PRINCIPAL COMPONENTS
    OKAMOTO, M
    KANAZAWA, M
    ANNALS OF MATHEMATICAL STATISTICS, 1968, 39 (03): : 859 - &
  • [48] Precision Matrix Estimation with Noisy and Missing Data
    Fan, Roger
    Jang, Byoungwook
    Sun, Yuekai
    Zhou, Shuheng
    22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [49] MINIMIZATION OF EIGENVALUES OF A MATRIX AND OPTIMALITY OF PRINCIPAL COMPONENTS
    OKAMOTO, M
    KANZAWA, M
    ANNALS OF MATHEMATICAL STATISTICS, 1967, 38 (06): : 1935 - &
  • [50] Robust rank-one matrix completion with rank estimation
    Li, Ziheng
    Nie, Feiping
    Wang, Rong
    Li, Xuelong
    PATTERN RECOGNITION, 2023, 142