Learning manifolds from non-stationary streams

Cited: 0
Authors
Mahapatra, Suchismit [1 ]
Chandola, Varun [1 ]
Affiliations
[1] SUNY Buffalo, Dept Comp Sci, Buffalo, NY 14261 USA
Keywords
Manifold learning; Dimension reduction; Streaming data; Isomap; Gaussian process; Nonlinear dimensionality reduction; Eigenmaps
DOI
10.1186/s40537-023-00872-8
CLC number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
Streaming adaptations of manifold-learning-based dimensionality reduction methods, such as Isomap, rest on the assumption that a small initial batch of observations suffices for exact learning of the manifold, while the remaining streaming data instances can be cheaply mapped onto it. However, there are no theoretical results showing that this core assumption is valid. Moreover, such methods typically assume that the underlying data distribution is stationary and are not equipped to detect, or handle, sudden changes or gradual drifts in the distribution that may occur while the data is streaming. We present theoretical results showing that the quality of the learned manifold converges asymptotically as the size of the data increases. We then show that a Gaussian Process Regression (GPR) model that uses a manifold-specific kernel function and is trained on an initial batch of sufficient size can closely approximate state-of-the-art streaming Isomap algorithms, and that the predictive variance obtained from the GPR prediction can serve as an effective detector of changes in the underlying data distribution. Results on several synthetic and real data sets show that the resulting algorithm can effectively learn lower-dimensional representations of high-dimensional data in a streaming setting while identifying shifts in the generative distribution. For instance, key findings on a gas sensor array data set show that our method can detect changes in the underlying data stream triggered by real-world factors, such as the introduction of a new gas into the system, while efficiently mapping the data onto a low-dimensional manifold.
Pages: 24
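To make the pipeline sketched in the abstract concrete, the snippet below is a minimal illustration of the core idea: learn a batch Isomap embedding from an initial sample, train a GPR model to map ambient points to manifold coordinates, and use the GPR predictive variance as a change detector. This is a sketch under stated assumptions, not the authors' implementation: the swiss-roll data, the standard RBF kernel (standing in for the paper's manifold-specific kernel), and the variance threshold are all illustrative choices.

```python
# Hedged sketch of the GPR-based streaming manifold-learning idea.
# Assumptions (not from the paper): swiss-roll data, a plain RBF
# kernel instead of the manifold-specific kernel, ad hoc threshold.
import numpy as np
from sklearn.manifold import Isomap
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Initial batch: a 2-D swiss-roll manifold embedded in 3-D ambient space.
t = rng.uniform(1.5 * np.pi, 4.5 * np.pi, 500)
h = rng.uniform(0.0, 10.0, 500)
X_batch = np.column_stack([t * np.cos(t), h, t * np.sin(t)])

# Exact (batch) Isomap embedding of the initial batch.
iso = Isomap(n_neighbors=10, n_components=2)
Y_batch = iso.fit_transform(X_batch)

# GPR maps ambient points to manifold coordinates; its predictive
# variance acts as the distribution-change detector.
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=2.0), alpha=1e-4)
gpr.fit(X_batch, Y_batch)

def map_stream_point(x, var_threshold=0.5):
    """Embed one streaming point; flag possible drift when the GPR
    predictive variance exceeds an (ad hoc) threshold."""
    y, std = gpr.predict(x.reshape(1, -1), return_std=True)
    drift = bool(np.max(std ** 2) > var_threshold)
    return y[0], drift

# A point near the training manifold maps with low variance; a point
# far from it (mimicking a distribution shift) is flagged.
y_on, drift_on = map_stream_point(X_batch[0] + 0.01)
y_off, drift_off = map_stream_point(np.array([100.0, 100.0, 100.0]))
print(drift_on, drift_off)  # expected: False True
```

In this sketch each streaming point costs one GPR prediction rather than a full re-embedding, which is the efficiency argument made in the abstract; a calibrated threshold on the predictive variance would replace the fixed value used here.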
Related papers (showing 41-50 of 50)
  • [41] Lee, Hyunin; Ding, Yuhao; Lee, Jongmin; Jin, Ming; Lavaei, Javad; Sojoudi, Somayeh. Tempo Adaptation in Non-stationary Reinforcement Learning. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023.
  • [42] Bhardwaj, Kartikeya; Marculescu, Radu. Non-Stationary Bayesian Learning for Global Sustainability. IEEE Transactions on Sustainable Computing, 2017, 2(3): 304-316.
  • [43] Robinson, Joshua W.; Hartemink, Alexander J. Learning Non-Stationary Dynamic Bayesian Networks. Journal of Machine Learning Research, 2010, 11: 3647-3680.
  • [44] Husmeier, D. Learning non-stationary conditional probability distributions. Neural Networks, 2000, 13(3): 287-290.
  • [45] Feng, Fan; Huang, Biwei; Zhang, Kun; Magliacane, Sara. Factored Adaptation for Non-stationary Reinforcement Learning. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
  • [46] Migenda, Nico; Moeller, Ralf; Schenck, Wolfram. NGPCA: Clustering of high-dimensional and non-stationary data streams. Software Impacts, 2024, 20.
  • [47] Parker, Brandon S.; Khan, Latifur; Bifet, Albert. Incremental Ensemble Classifier Addressing Non-Stationary Fast Data Streams. 2014 IEEE International Conference on Data Mining Workshop (ICDMW), 2014: 716-723.
  • [48] Abdi, Afshin; Fekri, Faramarz. Mixture Source Identification in Non-Stationary Data Streams with Applications in Compression. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017: 2502-2506.
  • [49] Ghazikhani, Adel; Monsefi, Reza; Yazdi, Hadi Sadoghi. Ensemble of online neural networks for non-stationary and imbalanced data streams. Neurocomputing, 2013, 122: 535-544.
  • [50] Korycki, Lukasz; Krawczyk, Bartosz. Online Oversampling for Sparsely Labeled Imbalanced and Non-Stationary Data Streams. 2020 International Joint Conference on Neural Networks (IJCNN), 2020.