Online landmark replacement for out-of-sample dimensionality reduction methods

Authors
Thongprayoon, Chanon [1]
Masuda, Naoki [1, 2, 3]
Affiliations
[1] SUNY Buffalo, Dept Math, Buffalo, NY 14068 USA
[2] SUNY Buffalo, Inst Artificial Intelligence & Data Sci, Buffalo, NY 14068 USA
[3] Kobe Univ, Ctr Computat Social Sci, Kobe, Japan
Funding
Japan Science and Technology Agency
Keywords
time-series analysis; dimensionality reduction; geometric graph; temporal networks; anomaly detection
DOI
10.1098/rspa.2023.0966
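Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract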
A strategy to assist visualization and analysis of large and complex datasets is dimensionality reduction, with which one maps each data point into a low-dimensional manifold. However, various dimensionality reduction techniques are computationally infeasible for large data. Out-of-sample techniques aim to resolve this difficulty; they only apply the dimensionality reduction technique on a small portion of data, referred to as landmarks, and determine the embedding coordinates of the other points using landmarks as references. Out-of-sample techniques have been applied to online settings, or when data arrive as time series. However, existing online out-of-sample techniques use either all the previous data points as landmarks or the fixed set of landmarks and therefore are potentially not good at capturing the geometry of the entire dataset when the time series is non-stationary. To address this problem, we propose an online landmark replacement algorithm for out-of-sample techniques using geometric graphs and the minimal dominating set on them. We mathematically analyse some properties of the proposed algorithm, particularly focusing on the case of landmark multi-dimensional scaling as the out-of-sample technique, and test its performance on synthetic and empirical time-series data.
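The two ingredients named in the abstract, choosing landmarks as a dominating set of a geometric graph and placing the remaining points relative to those landmarks, can be illustrated with a minimal offline sketch. This is not the authors' online replacement algorithm. It assumes an epsilon-ball graph on Euclidean data, swaps in a simple greedy heuristic that returns a dominating set that is not necessarily minimal, and uses the standard landmark multi-dimensional scaling triangulation step. The function names and the parameters `eps` and `k` are hypothetical choices made for illustration.

```python
import numpy as np


def pairwise_sq_dists(A, B):
    """Squared Euclidean distances between the rows of A and the rows of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def greedy_dominating_landmarks(X, eps):
    """Greedy dominating set of the eps-ball graph on the rows of X.

    Assumption: a greedy heuristic, so the result dominates every point
    but is not guaranteed to be a minimal or minimum dominating set.
    """
    adj = pairwise_sq_dists(X, X) <= eps ** 2  # each point neighbours itself
    uncovered = np.ones(len(X), dtype=bool)
    landmarks = []
    while uncovered.any():
        # Number of still-uncovered points each candidate would cover.
        gain = (adj & uncovered).sum(axis=1)
        best = int(np.argmax(gain))
        landmarks.append(best)
        uncovered &= ~adj[best]
    return np.array(landmarks)


def landmark_mds(X, landmark_idx, k):
    """Embed all rows of X in k dimensions using only distances to landmarks."""
    L = X[landmark_idx]
    delta = pairwise_sq_dists(L, L)
    m = len(L)
    # Classical MDS on the landmarks: double-centre the squared distances.
    J = np.eye(m) - np.ones((m, m)) / m
    B = -0.5 * J @ delta @ J
    evals, evecs = np.linalg.eigh(B)  # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:k]
    lam, V = evals[top], evecs[:, top]
    if np.any(lam <= 1e-12):
        raise ValueError("landmarks do not span k dimensions")
    # Distance-based triangulation of every point against the landmarks.
    pinv = V.T / np.sqrt(lam)[:, None]   # shape (k, m)
    mu = delta.mean(axis=1)              # mean squared distance per landmark
    d = pairwise_sq_dists(X, L)          # shape (n, m)
    return -0.5 * (d - mu) @ pinv.T      # shape (n, k)
```

A call such as `landmark_mds(X, greedy_dominating_landmarks(X, eps), k)` embeds every row of `X` while running the eigendecomposition only on the landmark-by-landmark matrix. In an online setting the landmark set would additionally have to be updated as new points arrive, which this sketch does not attempt.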
Pages: 29