A graph regularized dimension reduction method for out-of-sample data

Cited by: 11
Authors
Tang, Mengfan [1]
Nie, Feiping [2,3]
Jain, Ramesh [1]
Affiliations
[1] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA
[2] Northwestern Polytech Univ, Sch Comp Sci, Xian, Peoples R China
[3] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning OPTIMAL, Xian, Peoples R China
Keywords
Dimension reduction; Out-of-sample data; Graph regularized PCA; Manifold learning; Clustering; RECOGNITION; EIGENMAPS
DOI
10.1016/j.neucom.2016.11.012
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Among the many dimension reduction techniques, Principal Component Analysis (PCA) specializes in vector data, whereas Laplacian embedding is typically used to embed graph data. Graph regularized PCA, a combination of the two, has also been developed: it learns a low-dimensional representation of vector data while incorporating graph structure. However, all of these approaches suffer from the out-of-sample problem: whenever new data arrives, it must be merged with the old data and the eigenvectors recomputed from scratch, at enormous computational cost. To address this problem, we extend graph regularized PCA to graph regularized linear regression PCA (grlrPCA). grlrPCA avoids the redundant computation on the old data by first learning a linear function and then applying it directly to new data to reduce its dimension. Furthermore, we derive an efficient iterative algorithm to solve the grlrPCA optimization problem and show that grlrPCA is closely related to unsupervised Linear Discriminant Analysis in the limit of an infinite regularization parameter. Evaluations on multiple metrics across seven real-world datasets demonstrate that grlrPCA outperforms established unsupervised dimension reduction algorithms.
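The abstract's core idea is that a linear projection is learned once on the training data under a graph (Laplacian) regularizer, so out-of-sample points are embedded by applying that projection directly, with no eigendecomposition over the combined old and new data. The NumPy sketch below illustrates this with a simple Laplacian-regularized linear PCA objective solved in closed form; the objective, the eigen-solution, and the names grlr_pca_sketch and lam are illustrative assumptions, not the paper's actual grlrPCA formulation or its iterative solver.

```python
import numpy as np

def grlr_pca_sketch(X, A, k, lam=0.1):
    """Illustrative Laplacian-regularized linear PCA (a sketch, not the paper's grlrPCA).

    X   : (d, n) data matrix, one sample per column
    A   : (n, n) symmetric graph adjacency over the n training samples
    k   : target dimension
    lam : graph regularization strength (hypothetical parameter name)
    """
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean                          # center the training data
    L = np.diag(A.sum(axis=1)) - A         # unnormalized graph Laplacian L = D - A
    # Maximize tr(P^T (Xc Xc^T - lam * Xc L Xc^T) P) s.t. P^T P = I:
    # variance is rewarded, disagreement with the graph is penalized.
    M = Xc @ Xc.T - lam * (Xc @ L @ Xc.T)
    M = (M + M.T) / 2                      # symmetrize against round-off
    w, V = np.linalg.eigh(M)               # eigenvalues in ascending order
    P = V[:, np.argsort(w)[::-1][:k]]      # top-k eigenvectors as the projection
    return P, mean

# Usage: the learned projection embeds unseen points directly.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 100))                     # 20 features, 100 samples
A = (rng.random((100, 100)) < 0.05).astype(float)  # synthetic sparse graph
A = np.maximum(A, A.T)
np.fill_diagonal(A, 0)                             # symmetric, no self-loops
P, mean = grlr_pca_sketch(X, A, k=2)

x_new = rng.normal(size=(20, 5))                   # 5 out-of-sample points
y_new = P.T @ (x_new - mean)                       # no recomputation over old data
print(y_new.shape)                                 # (2, 5)
```

The design point this sketch shares with the paper is the last two lines: once the linear map is available, embedding new data is a single matrix product rather than a fresh eigenproblem over old and new data together.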
Pages: 58-63
Page count: 6
Related papers (50 records in total)
  • [21] Out-of-Sample Tuning for Causal Discovery
    Biza, Konstantina
    Tsamardinos, Ioannis
    Triantafillou, Sofia
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04) : 4963 - 4973
  • [22] Forecasting in the presence of in-sample and out-of-sample breaks
    Xu, Jiawen
    Perron, Pierre
    EMPIRICAL ECONOMICS, 2023, 64 (06) : 3001 - 3035
  • [23] Sample Out-of-Sample Inference Based on Wasserstein Distance
    Blanchet, Jose
    Kang, Yang
    OPERATIONS RESEARCH, 2021, 69 (03) : 985 - 1013
  • [24] Out-of-sample extrapolation of learned manifolds
    Chin, Tat-Jun
    Suter, David
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2008, 30 (09) : 1547 - 1556
  • [25] Out-of-Sample Fusion in Risk Prediction
    Katzoff, Myron
    Zhou, Wen
    Khan, Diba
    Lu, Guanhua
    Kedem, Benjamin
    JOURNAL OF STATISTICAL THEORY AND PRACTICE, 2014, 8 (03) : 444 - 459
  • [26] Testing out-of-sample portfolio performance
    Kazak, Ekaterina
    Pohlmeier, Winfried
    INTERNATIONAL JOURNAL OF FORECASTING, 2019, 35 (02) : 540 - 554
  • [27] Mathematical analysis on out-of-sample extensions
    Wang, Jianzhong
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2018, 16 (05)
  • [28] AN OUT-OF-SAMPLE METHOD TO QUANTIFY SYSTEMATIC ERROR IN UNANCHORED INDIRECT COMPARISONS
    Muresan, B.
    Hu, Y.
    Heeg, B.
    Postma, M. J.
    Ouwens, M. J.
    VALUE IN HEALTH, 2018, 21 : S397 - S397
  • [29] On the use of the peaks over thresholds method for estimating out-of-sample quantiles
    El-Aroui, MA
    Diebolt, J
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2002, 39 (04) : 453 - 475