Online landmark replacement for out-of-sample dimensionality reduction methods

Authors
Thongprayoon, Chanon [1]
Masuda, Naoki [1, 2, 3]
Affiliations
[1] SUNY Buffalo, Dept Math, Buffalo, NY 14068 USA
[2] SUNY Buffalo, Inst Artificial Intelligence & Data Sci, Buffalo, NY 14068 USA
[3] Kobe Univ, Ctr Computat Social Sci, Kobe, Japan
Funding
Japan Science and Technology Agency
Keywords
time-series analysis; dimensionality reduction; geometric graph; temporal networks; anomaly detection
DOI
10.1098/rspa.2023.0966
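Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline Classification Code
07; 0710; 09
Abstract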
A strategy to assist visualization and analysis of large and complex datasets is dimensionality reduction, with which one maps each data point into a low-dimensional manifold. However, various dimensionality reduction techniques are computationally infeasible for large data. Out-of-sample techniques aim to resolve this difficulty; they only apply the dimensionality reduction technique on a small portion of data, referred to as landmarks, and determine the embedding coordinates of the other points using landmarks as references. Out-of-sample techniques have been applied to online settings, or when data arrive as time series. However, existing online out-of-sample techniques use either all the previous data points as landmarks or the fixed set of landmarks and therefore are potentially not good at capturing the geometry of the entire dataset when the time series is non-stationary. To address this problem, we propose an online landmark replacement algorithm for out-of-sample techniques using geometric graphs and the minimal dominating set on them. We mathematically analyse some properties of the proposed algorithm, particularly focusing on the case of landmark multi-dimensional scaling as the out-of-sample technique, and test its performance on synthetic and empirical time-series data.
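The idea of choosing landmarks as a dominating set of a geometric graph can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the online replacement rule, so the code below only shows a static, greedy selection. The function name `greedy_landmarks` and the radius parameter `eps` are assumptions introduced for illustration. Two points are treated as adjacent when their Euclidean distance is at most `eps`. A single pass keeps a point as a landmark only if no earlier landmark lies within `eps` of it, so the result is a maximal independent set of that graph, and every maximal independent set is a minimal dominating set.

```python
import numpy as np


def greedy_landmarks(points, eps):
    """Greedy dominating set of the eps-geometric graph (illustrative only).

    points : array-like of shape (n, d), one data point per row.
    eps    : hypothetical connection radius; points within eps are adjacent.
    Returns the row indices of the chosen landmarks.
    """
    points = np.asarray(points, dtype=float)
    landmarks = []
    for i, p in enumerate(points):
        if landmarks:
            # Distance from p to every landmark chosen so far.
            dists = np.linalg.norm(points[landmarks] - p, axis=1)
            if dists.min() <= eps:
                continue  # p is already dominated by an existing landmark
        landmarks.append(i)  # p is not covered yet, so it becomes a landmark
    return landmarks


if __name__ == "__main__":
    # Toy one-dimensional series, processed in arrival order.
    series = np.array([[0.0], [0.5], [0.9], [1.6], [3.0]])
    print(greedy_landmarks(series, eps=1.0))
```

Because the pass visits points in arrival order, the same loop could in principle be run on a stream, but deciding when to drop or replace old landmarks is the part the paper addresses and is not reproduced here.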
Pages: 29