A graph regularized dimension reduction method for out-of-sample data

Cited: 11
Authors
Tang, Mengfan [1 ]
Nie, Feiping [2 ,3 ]
Jain, Ramesh [1 ]
Affiliations
[1] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA
[2] Northwestern Polytech Univ, Sch Comp Sci, Xian, Peoples R China
[3] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning OPTIMAL, Xian, Peoples R China
Keywords
Dimension reduction; Out-of-sample data; Graph regularized PCA; Manifold learning; Clustering; RECOGNITION; EIGENMAPS
DOI
10.1016/j.neucom.2016.11.012
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Among various dimension reduction techniques, Principal Component Analysis (PCA) specializes in treating vector data, whereas Laplacian embedding is often employed to embed graph data. Graph regularized PCA, a combination of the two, has also been developed to aid the learning of a low-dimensional representation of vector data by incorporating graph data. However, these approaches are confronted by the out-of-sample problem: each time new data arrive, they must be combined with the old data and fed into the algorithm to re-compute the eigenvectors, incurring enormous computational cost. To address this problem, we extend graph regularized PCA to graph regularized linear regression PCA (grlrPCA). grlrPCA eliminates the redundant computation on the old data by first learning a linear function and then applying it directly to new data to reduce their dimension. Furthermore, we derive an efficient iterative algorithm to solve the grlrPCA optimization problem and show that grlrPCA is closely related to unsupervised Linear Discriminant Analysis in the limit of an infinite regularization parameter. Evaluations on multiple metrics over seven real-world datasets demonstrate that grlrPCA outperforms established unsupervised dimension reduction algorithms.
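The mechanism the abstract describes, fitting a linear projection once on the training data and then mapping new samples through it without re-solving the eigen-problem, can be sketched as follows. This is only a minimal illustration, not the authors' grlrPCA (which is derived from a different objective and solved iteratively): the simplified trace objective, the helper names knn_graph_laplacian and graph_regularized_linear_projection, and the parameter lam are assumptions made for the example.

```python
import numpy as np

def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetric k-nearest-neighbor graph.
    X has shape (d, n): one sample per column. (Illustrative helper.)"""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # pairwise squared distances
    A = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[1:k + 1]  # nearest neighbors, skipping the point itself
        A[i, nbrs] = 1.0
    A = np.maximum(A, A.T)  # symmetrize the adjacency matrix
    return np.diag(A.sum(axis=1)) - A

def graph_regularized_linear_projection(X, L, dim=2, lam=0.1):
    """Linear map W of shape (d, dim) that maximizes variance while
    penalizing embeddings that differ across graph edges; assumed
    simplified objective: max_W tr(W^T (X X^T - lam * X L X^T) W), W^T W = I."""
    M = X @ X.T - lam * (X @ L @ X.T)
    M = 0.5 * (M + M.T)  # guard against round-off asymmetry
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, np.argsort(vals)[::-1][:dim]]  # top-`dim` eigenvectors

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 200))   # d = 10 features, n = 200 training samples
mu = X.mean(axis=1, keepdims=True)
Xc = X - mu                          # center once, on the training data
W = graph_regularized_linear_projection(Xc, knn_graph_laplacian(Xc), dim=2)

# Out-of-sample data: apply the learned linear map directly,
# with no eigendecomposition over the combined old + new data.
X_new = rng.standard_normal((10, 20))
Y_new = W.T @ (X_new - mu)           # shape (2, 20)
```

The point of the sketch is the last two lines: once W is learned, projecting X_new costs a single matrix product, which is the saving over re-running the eigendecomposition on old and new data together.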
Pages: 58-63 (6 pages)
Related papers
50 records in total (entries [31]-[40] shown)
  • [31] Out-of-Sample Fusion in Risk Prediction
    Katzoff, Myron
    Zhou, Wen
    Khan, Diba
    Lu, Guanhua
    Kedem, Benjamin
    JOURNAL OF STATISTICAL THEORY AND PRACTICE, 2014, 8 (3) : 444 - 459
  • [32] Out-of-Sample Embedding by Sparse Representation
    Raducanu, Bogdan
    Dornaika, Fadi
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, 2012, 7626 : 336 - 344
  • [34] In-Sample and Out-of-Sample Predictability of Cryptocurrency Returns
    Park, Kyungjin
    Lee, Hojin
    EAST ASIAN ECONOMIC REVIEW, 2023, 27 (03) : 213 - 242
  • [35] A note on in-sample and out-of-sample tests for Granger causality
    Chen, SS
    JOURNAL OF FORECASTING, 2005, 24 (06) : 453 - 464
  • [36] A note on the out-of-sample performance of resampled efficiency
    Scherer, Bernd
    JOURNAL OF ASSET MANAGEMENT, 2006, 7 (3-4) : 170 - 178
  • [37] Out-of-Sample Performance of Mutual Fund Predictors
    Jones, Christopher S.
    Mo, Haitao
    REVIEW OF FINANCIAL STUDIES, 2021, 34 (01): : 149 - 193
  • [38] Out-of-Sample Predictability of the Equity Risk Premium
    de Almeida, Daniel
    Fuertes, Ana-Maria
    Hotta, Luiz Koodi
    MATHEMATICS, 2025, 13 (02)
  • [39] GAUSSIAN PROCESS REGRESSION FOR OUT-OF-SAMPLE EXTENSION
    Barkan, Oren
    Weill, Jonathan
    Averbuch, Amir
    2016 IEEE 26TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2016
  • [40] Improving Out-of-Sample Prediction of Quality of MRIQC
    Esteban, Oscar
    Poldrack, Russell A.
    Gorgolewski, Krzysztof J.
    INTRAVASCULAR IMAGING AND COMPUTER ASSISTED STENTING AND LARGE-SCALE ANNOTATION OF BIOMEDICAL DATA AND EXPERT LABEL SYNTHESIS, 2018, 11043 : 190 - 199