A graph regularized dimension reduction method for out-of-sample data

Cited by: 11
Authors
Tang, Mengfan [1 ]
Nie, Feiping [2 ,3 ]
Jain, Ramesh [1 ]
Affiliations
[1] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA
[2] Northwestern Polytech Univ, Sch Comp Sci, Xian, Peoples R China
[3] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning OPTIMAL, Xian, Peoples R China
Keywords
Dimension reduction; Out-of-sample data; Graph regularized PCA; Manifold learning; Clustering; Recognition; Eigenmaps
DOI
10.1016/j.neucom.2016.11.012
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Among dimension reduction techniques, Principal Component Analysis (PCA) specializes in vector data, whereas Laplacian embedding is typically employed for graph data. Graph regularized PCA, a combination of the two, has also been developed to learn a low-dimensional representation of vector data with the aid of graph data. However, these approaches all face the out-of-sample problem: whenever new data arrives, it must be combined with the old data and the eigenvectors recomputed over the whole set, incurring a large computational cost. To address this problem, we extend graph regularized PCA to graph regularized linear regression PCA (grlrPCA). grlrPCA avoids redundant recomputation over the old data by first learning a linear function and then applying it directly to new data to reduce its dimension. Furthermore, we derive an efficient iterative algorithm to solve the grlrPCA optimization problem and show that grlrPCA is closely related to unsupervised Linear Discriminant Analysis in the limit of an infinite regularization parameter. Evaluations on seven real-world datasets, under multiple metrics, demonstrate that grlrPCA outperforms established unsupervised dimension reduction algorithms.
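The abstract's central idea is to learn a linear projection once on the training data, so that any new sample is embedded by a single matrix multiplication. The Python sketch below illustrates this under assumptions; it is not the paper's grlrPCA algorithm. The objective used here (variance maximization with a graph Laplacian penalty under an orthonormality constraint, solved by one eigendecomposition instead of the paper's iterative scheme) is a simplified stand-in, and the names graph_laplacian, fit_linear_graph_pca, alpha, and n_neighbors are illustrative.

import numpy as np

def graph_laplacian(X, n_neighbors=5):
    # Unnormalized Laplacian L = D - S of a symmetrized k-NN affinity graph.
    # X holds one sample per column (d features x n samples).
    n = X.shape[1]
    d2 = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # pairwise squared distances
    S = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(d2[i])[1:n_neighbors + 1]           # nearest neighbors, self excluded
        S[i, idx] = np.exp(-d2[i, idx] / d2[i, idx].mean())  # heat-kernel weights
    S = np.maximum(S, S.T)                                   # symmetrize
    return np.diag(S.sum(axis=1)) - S

def fit_linear_graph_pca(X, L, k, alpha):
    # Learn W (d x k) maximizing tr(W^T (X X^T - alpha * X L X^T) W), W^T W = I:
    # retain variance while keeping graph neighbors close in the embedding.
    M = X @ X.T - alpha * (X @ L @ X.T)
    M = (M + M.T) / 2                                        # enforce exact symmetry
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, np.argsort(vals)[::-1][:k]]               # top-k eigenvectors as columns

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 100))                           # 10 features, 100 training samples
X -= X.mean(axis=1, keepdims=True)                           # center, as in PCA

W = fit_linear_graph_pca(X, graph_laplacian(X), k=2, alpha=0.1)

# Out-of-sample step: new points reuse the learned W directly; no
# eigendecomposition over the combined old and new data is needed.
X_new = rng.standard_normal((10, 5))
Z_new = W.T @ X_new                                          # 2 x 5 low-dimensional embedding

The payoff matches the abstract's complaint: rerunning graph regularized PCA on the augmented dataset would cost a fresh eigendecomposition per update, whereas embedding m new points with the learned linear map costs only O(dkm).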
Pages: 58-63
Number of pages: 6
Related papers (items [41]-[50] of 50 shown)
  • [41] Are the GARCH models best in out-of-sample performance
    Lee, K. Y.
    Economics Letters, 1991, 37(3): 305-308
  • [42] A note on the out-of-sample performance of resampled efficiency
    Scherer, Bernd
    Journal of Asset Management, 2006, 7(3-4): 170-178
  • [43] On the out-of-sample predictability of stock market returns
    Guo, H.
    Journal of Business, 2006, 79(2): 645-670
  • [44] Out-of-sample stock return predictability in Australia
    Dou, Yiwen
    Gallagher, David R.
    Schneider, David H.
    Walter, Terry S.
    Australian Journal of Management, 2012, 37(3): 461-479
  • [45] Out-of-Sample Eigenvectors in Kernel Spectral Clustering
    Alzate, Carlos
    Suykens, Johan A. K.
    2011 International Joint Conference on Neural Networks (IJCNN), 2011: 2349-2356
  • [46] Out-of-Sample Representation Learning for Knowledge Graphs
    Albooyeh, Marjan
    Goel, Rishab
    Kazemi, Seyed Mehran
    Findings of the Association for Computational Linguistics: EMNLP 2020, 2020: 2657-2666
  • [47] Multivariate out-of-sample tests for Granger causality
    Gelper, Sarah
    Croux, Christophe
    Computational Statistics & Data Analysis, 2007, 51(7): 3319-3329
  • [48] Erratum to: Out-of-Sample Fusion in Risk Prediction
    Katzoff, Myron
    Zhou, Wen
    Khan, Diba
    Lu, Guanhua
    Kedem, Benjamin
    Journal of Statistical Theory and Practice, 2014, 8(4): 792
  • [49] Efficient Out-of-Sample Pricing of VIX Futures
    Guo, Shuxin
    Liu, Qiang
    Journal of Derivatives, 2020, 27(3): 126-139
  • [50] Dimensionality Reduction for Hyperspectral Data Based on Sample-Dependent Repulsion Graph Regularized Auto-encoder
    Wang, Xuesong
    Kong, Yi
    Cheng, Yuhu
    Chinese Journal of Electronics, 2017, 26(6): 1233-1238