Selecting the number of factors in principal component analysis by permutation testing: Numerical and practical aspects

Cited by: 21
Authors
Vitale, Raffaele [1 ,2 ,3 ]
Westerhuis, Johan A. [4 ]
Naes, Tormod [5 ]
Smilde, Age K. [4 ]
de Noord, Onno E. [6 ]
Ferrer, Alberto [1 ]
Affiliations
[1] Univ Politecn Valencia, Dept Estadist & Invest Operat Aplicadas & Calidad, Grp Ingn Estadist Multivariante, Camino Vera S-N, E-46022 Valencia, Spain
[2] Katholieke Univ Leuven, Dept Chem, Mol Imaging & Photon Unit, Celestijnenlaan 200F, B-3001 Leuven, Belgium
[3] Univ Lille Sci & Technol, Lab Spectrochim Infrarouge & Raman, UMR 8516, Batiment C5, F-59655 Villeneuve Dascq, France
[4] Univ Amsterdam, Swammerdam Inst Life Sci, Biosyst Data Anal, Sci Pk 904, NL-1098 XH Amsterdam, Netherlands
[5] Nofima AS, N-1431 As, Norway
[6] Shell Global Solut Int BV, Shell Technol Ctr Amsterdam, NL-1030 BN Amsterdam, Netherlands
Keywords
deflation; eigenvalues; permutation testing; principal component analysis (PCA); projection; HORNS PARALLEL ANALYSIS; CROSS-VALIDATION; PCA MODELS; ALGORITHM; VARIABLES; MATRIX; RETAIN; RULES;
DOI
10.1002/cem.2937
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Selecting the correct number of factors in principal component analysis (PCA) is a critical step to achieve reasonable data modelling, where the optimal strategy strictly depends on the objective PCA is applied for. In the last decades, much work has been devoted to methods like Kaiser's eigenvalue greater than 1 rule, Velicer's minimum average partial rule, Cattell's scree test, Bartlett's chi-square test, Horn's parallel analysis, and cross-validation. However, limited attention has been paid to the possibility of assessing the significance of the calculated components via permutation testing. That may represent a feasible approach in case the focus of the study is discriminating relevant from nonsystematic sources of variation and/or the aforementioned methodologies cannot be resorted to (e.g., when the analysed matrices do not fulfill specific properties or statistical assumptions). The main aim of this article is to provide practical insights for an improved understanding of permutation testing, highlighting its pros and cons, mathematically formalising the numerical procedure to be abided by when applying it for PCA factor selection by the description of a novel algorithm developed to this end, and proposing ad hoc solutions for optimising computational time and efficiency.
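As a rough illustration of the general idea in the abstract, the sketch below tests each component's share of explained variance against a null distribution obtained by shuffling every column independently. It is a generic, simplified scheme and does not reproduce the article's algorithm, which this record does not describe in detail; in particular it omits the deflation and projection steps named in the keywords. The function name `permutation_pca_rank` and the parameters `n_perm`, `alpha`, and `seed` are assumptions made for this example.

```python
import numpy as np


def permutation_pca_rank(X, n_perm=200, alpha=0.05, seed=0):
    """Hypothetical helper: count leading PCs that pass a permutation test.

    Simplified sketch only; not the algorithm proposed in the article.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)  # mean-centre each variable

    # Squared singular values are proportional to the PCA eigenvalues.
    obs = np.linalg.svd(Xc, compute_uv=False) ** 2
    obs_frac = obs / obs.sum()  # fraction of variance per component

    null = np.empty((n_perm, obs.size))
    for b in range(n_perm):
        # Shuffle each column on its own: column variances are preserved,
        # but any correlation between variables is destroyed.
        Xp = np.column_stack(
            [rng.permutation(Xc[:, j]) for j in range(Xc.shape[1])]
        )
        s = np.linalg.svd(Xp, compute_uv=False) ** 2
        null[b] = s / s.sum()

    # Permutation p-value per component, with the usual +1 correction.
    p = (1 + (null >= obs_frac).sum(axis=0)) / (n_perm + 1)

    # Accept components in order and stop at the first non-significant one.
    n_sig = 0
    for pv in p:
        if pv > alpha:
            break
        n_sig += 1
    return n_sig, p
```

Because large leading components inflate the total variance, this naive comparison tends to be conservative for later components; removing already accepted components before testing the next one is one common way to reduce that effect.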
Pages: 15
Related papers
50 in total
  • [21] Radar emitter number estimation based on principal component analysis
    Chen, Taowei
    Jin, Weidong
    Xinan Jiaotong Daxue Xuebao/Journal of Southwest Jiaotong University, 2009, 44 (04): : 501 - 506
  • [22] A DECISION PROCEDURE FOR DETERMINING THE NUMBER OF COMPONENTS IN PRINCIPAL COMPONENT ANALYSIS
    HUANG, DY
    TSENG, ST
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 1992, 30 (01) : 63 - 71
  • [23] On the estimation of the number of components in multivariate functional principal component analysis
    Golovkine, Steven
    Gunning, Edward
    Simpkin, Andrew J.
    Bargary, Norma
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2025,
  • [24] Identification of Excitation Source Number using Principal Component Analysis
    Dong, Jian Chao
    Yang, Tie Jun
    Li, Xin Hui
    Shuai, Zhi Jun
    Xiao, You Hong
    ADVANCES IN MECHANICAL DESIGN, PTS 1 AND 2, 2011, 199-200 : 850 - 857
  • [25] Variable Neighborhood Search Heuristics for Selecting a Subset of Variables in Principal Component Analysis
    Brusco, Michael J.
    Singh, Renu
    Steinley, Douglas
    PSYCHOMETRIKA, 2009, 74 (04) : 705 - 726
  • [26] Variable Neighborhood Search Heuristics for Selecting a Subset of Variables in Principal Component Analysis
    Michael J. Brusco
    Renu Singh
    Douglas Steinley
    Psychometrika, 2009, 74 : 705 - 726
  • [27] Selecting the Top-k Discriminative Features Using Principal Component Analysis
    Kane, Aminata
    Shiri, Nematollaah
    2016 IEEE 16TH INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 2016, : 639 - 646
  • [28] Factors determining farmers' progressiveness: A principal component analysis
    Kumar, Rakesh K.
    Paul, Sudipta
    Singh, Premlata
    Chahal, V. P.
    INDIAN JOURNAL OF AGRICULTURAL SCIENCES, 2015, 85 (08): : 1026 - 1029
  • [29] Principal Component Analysis of Electricity Consumption Factors in China
    Zhang, Jing
    Yang, Xin-yao
    Shen, Fei
    Li, Yuan-wei
    Xiao, Hong
    Qi, Hui
    Peng, Hong
    Deng, Shi-huai
    2012 INTERNATIONAL CONFERENCE ON FUTURE ENERGY, ENVIRONMENT, AND MATERIALS, PT C, 2012, 16 : 1913 - 1918
  • [30] FACTORS ASSOCIATED WITH EROSIVE RHEUMATOID ARTHRITIS, A MULTIMARKER PRINCIPAL COMPONENT ANALYSIS (PCA) AND PRINCIPAL COMPONENT REGRESSION (PCR) ANALYSIS
    Adami, G.
    Orsolini, G.
    Fassio, A.
    Viapiana, O.
    Sorio, E.
    Benini, C.
    Gatti, D.
    Bertelle, D.
    Rossini, M.
    ANNALS OF THE RHEUMATIC DISEASES, 2023, 82 : 497 - 498