FAAD: an unsupervised fast and accurate anomaly detection method for a multi-dimensional sequence over data stream

Cited by: 0
Authors
Bin LI [1]
Yi-jie WANG [1]
Dong-sheng YANG [2]
Yong-mou LI [1]
Xing-kong MA [1]
Affiliations
[1] Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer, National University of Defense Technology
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Data stream; Multi-dimensional sequence; Anomaly detection; Concept drift; Feature selection
DOI
Not available
Chinese Library Classification (CLC) number
TP311.13
Discipline classification code
1201
Abstract
Recently, sequence anomaly detection has been widely used in many fields. Sequence data in these fields are usually multi-dimensional and arrive over a data stream. Designing an anomaly detection method for a multi-dimensional sequence over the data stream that is both accurate and fast is challenging for two reasons: (1) redundant dimensions in the sequence data and a large state space weaken sequence modeling; (2) anomaly detection cannot adapt to the high-speed nature of the data stream, especially when concept drift occurs, which reduces the detection rate. On the one hand, most existing sequence anomaly detection methods focus on single-dimensional sequences. On the other hand, studies of multi-dimensional sequences concentrate mainly on static databases rather than data streams. To improve the performance of anomaly detection for a multi-dimensional sequence over the data stream, we propose a novel unsupervised fast and accurate anomaly detection (FAAD) method, which comprises three algorithms. First, a method called "information calculation and minimum spanning tree cluster" is adopted to reduce redundant dimensions. Second, to speed up model construction and ensure the detection rate for the sequence over the data stream, we propose a method called "random sampling and subsequence partitioning based on the index probabilistic suffix tree." Last, the method called "anomaly buffer based on model dynamic adjustment" dramatically reduces the effects of concept drift in the data stream. FAAD is implemented on the streaming platform Storm to detect anomalies in multi-dimensional log audit data. Compared with existing anomaly detection methods, FAAD performs well in both detection rate and speed without being affected by concept drift.
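As an illustration of the first step only, the following is a minimal Python sketch of grouping redundant dimensions with an information measure and a minimum spanning tree. The abstract does not specify the measure or the clustering rule, so everything here is an assumption: mutual information between discrete dimensions is used as the "information calculation", the distance 1 - I(X;Y) / max(H(X), H(Y)) and the `cut` threshold are hypothetical, and the function names are invented for this example. It is not the authors' algorithm.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(values):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def distance(x, y):
    """Assumed redundancy distance: 1 - I(X;Y) / max(H(X), H(Y))."""
    h = max(entropy(x), entropy(y))
    if h == 0:
        return 0.0  # two constant dimensions are treated as fully redundant
    return 1.0 - mutual_information(x, y) / h


def cluster_dimensions(columns, cut=0.5):
    """Kruskal-style pass over dimension pairs sorted by distance.

    Edges are merged in increasing order and the pass stops at the first
    edge longer than `cut` (a hypothetical threshold). This yields the same
    groups as building the full minimum spanning tree and deleting its
    edges longer than `cut`.
    """
    names = list(columns)
    parent = {name: name for name in names}

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    edges = sorted(
        (distance(columns[a], columns[b]), a, b)
        for a, b in combinations(names, 2)
    )
    for d, a, b in edges:
        if d > cut:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return list(clusters.values())


# Toy log-audit style data: `uid` duplicates `user`, `action` is independent.
columns = {
    "user": ["a", "a", "b", "b", "a", "a", "b", "b"],
    "uid": [1, 1, 2, 2, 1, 1, 2, 2],
    "action": ["r", "w", "r", "w", "r", "w", "r", "w"],
}
clusters = cluster_dimensions(columns)
```

One dimension per cluster could then be kept as a representative, which shrinks the state space that a sequence model has to cover.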
Pages: 388-404
Page count: 17