Variational autoencoder-based outlier detection for high-dimensional data

Cited by: 9
Authors
Li, Yongmou [1, 2]
Wang, Yijie [1, 2]
Ma, Xingkong [2]
Affiliations
[1] Natl Univ Def Technol, Natl Lab Parallel & Distributed Proc, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China
Funding
Science Foundation of the Ministry of Education of China; National Natural Science Foundation of China
Keywords
Variational autoencoders; outlier detection; high-dimensional data;
DOI
10.3233/IDA-184240
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Analysis of high-dimensional data often suffers from the curse of dimensionality and from complicated correlations among dimensions. Dimension reduction methods are often used to alleviate these problems. Existing outlier detection methods based on dimension reduction usually either rely only on reconstruction error to detect outliers or apply conventional outlier detection methods to the reduced data; either choice can degrade detection performance because it considers only part of the information in the data. Few studies have combined the two strategies. In this paper, we propose an outlier detection method based on the Variational Autoencoder (VAE) that combines the low-dimensional representation and the reconstruction error to detect outliers. Specifically, we first model the data using a VAE, then extract four outlier scores from the VAE model, and finally propose an ensemble method to combine the four scores. Experiments on six real-world datasets show that the proposed method performs better than, or at least comparably to, state-of-the-art methods.
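The idea of combining a reconstruction-based score with a score computed in the low-dimensional representation can be illustrated with a small sketch. The abstract does not specify the four scores or the ensemble rule, so everything below is an assumption made for illustration: a rank-k linear projection (PCA via SVD) stands in for a trained VAE, only two scores are computed, and the scores are combined by averaging their normalized ranks. The names `rank_normalize`, `combine_scores`, and `pca_scores` are hypothetical and do not come from the paper.

```python
import numpy as np

def rank_normalize(scores):
    """Map raw scores to (0, 1] by rank so differently scaled scores are comparable."""
    scores = np.asarray(scores, dtype=float)
    ranks = np.argsort(np.argsort(scores))  # 0-based ranks; ties broken by position
    return (ranks + 1) / len(scores)

def combine_scores(score_list):
    """Assumed ensemble rule: average the rank-normalized scores (larger = more outlying)."""
    return np.mean([rank_normalize(s) for s in score_list], axis=0)

def pca_scores(X, k):
    """Stand-in for a trained VAE (assumption): a rank-k linear encoder/decoder."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k].T                      # low-dimensional representation
    X_hat = Z @ Vt[:k]                     # reconstruction from the representation
    recon_error = np.sum((Xc - X_hat) ** 2, axis=1)
    # Distance from the origin in the standardized latent space.
    latent_dist = np.sqrt(np.sum((Z / Z.std(axis=0)) ** 2, axis=1))
    return recon_error, latent_dist

# Toy data: 200 points near a 2-D subspace of a 20-D space, plus one off-subspace point.
rng = np.random.default_rng(0)
inliers = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 20))
inliers += 0.05 * rng.normal(size=(200, 20))
X = np.vstack([inliers, 5.0 * rng.normal(size=(1, 20))])

recon_error, latent_dist = pca_scores(X, k=2)
combined = combine_scores([recon_error, latent_dist])
top5 = np.argsort(combined)[::-1][:5]      # indices of the five highest combined scores
```

Rank normalization is used here only because the two scores live on different scales; other combination rules (for example, averaging standardized scores) would fit the same pattern.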
Pages: 991-1002
Number of pages: 12