FAAD: an unsupervised fast and accurate anomaly detection method for a multi-dimensional sequence over data stream

Authors
Bin LI [1]
Yi-jie WANG [1]
Dong-sheng YANG [2]
Yong-mou LI [1]
Xing-kong MA [1]
Affiliations
[1] Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer, National University of Defense Technology
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Data stream; Multi-dimensional sequence; Anomaly detection; Concept drift; Feature selection
DOI
Not available
Chinese Library Classification number
TP311.13
Discipline classification code
1201
Abstract
Sequence anomaly detection is now widely used in many fields, where the sequence data are usually multi-dimensional and arrive as a data stream. Designing an anomaly detection method for a multi-dimensional sequence over a data stream that is both accurate and fast is challenging for two reasons: (1) redundant dimensions in the sequence data and a large state space weaken sequence modeling; (2) anomaly detection cannot keep up with the high speed of the data stream, and the detection rate drops, especially when concept drift occurs. Most existing sequence anomaly detection methods focus on single-dimension sequences, and the studies that do address multi-dimensional sequences concentrate mainly on static databases rather than data streams. To improve anomaly detection for a multi-dimensional sequence over a data stream, we propose a novel unsupervised fast and accurate anomaly detection (FAAD) method, which comprises three algorithms. First, a method called "information calculation and minimum spanning tree cluster" is adopted to reduce redundant dimensions. Second, to speed up model construction and ensure the detection rate for the sequence over the data stream, we propose a method called "random sampling and subsequence partitioning based on the index probabilistic suffix tree." Last, a method called "anomaly buffer based on model dynamic adjustment" dramatically reduces the effects of concept drift in the data stream. FAAD is implemented on the streaming platform Storm to detect anomalies in multi-dimensional log audit data. Compared with existing anomaly detection methods, FAAD performs well in detection rate and speed and is not affected by concept drift.
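The abstract names the dimension-reduction step "information calculation and minimum spanning tree cluster" but gives no details. The following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes discrete-valued dimensions, uses symmetric uncertainty (a normalised mutual information) as the information measure, builds the tree with Prim's algorithm, and cuts tree edges above a threshold. All function names and the `cut` parameter are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(column):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())


def symmetric_uncertainty(x, y):
    """Normalised mutual information in [0, 1]: 2*I(X;Y) / (H(X)+H(Y))."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 1.0  # two constant columns: treat as fully redundant
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def minimum_spanning_tree(n, dist):
    """Prim's algorithm on a complete graph; returns edges as (i, j, d)."""
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((i, j, dist[i][j]))
        in_tree.add(j)
    return edges


def cluster_dimensions(columns, cut=0.5):
    """Group redundant dimensions by dropping MST edges longer than `cut`.

    `cut` is an assumed threshold on the distance 1 - symmetric uncertainty.
    """
    n = len(columns)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = 1.0 - symmetric_uncertainty(columns[i], columns[j])
        dist[i][j] = dist[j][i] = d

    parent = list(range(n))  # union-find over the MST edges that are kept

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i, j, d in minimum_spanning_tree(n, dist):
        if d <= cut:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def select_representatives(columns, cut=0.5):
    """Keep one dimension per cluster (here: the one with highest entropy)."""
    return sorted(
        max(group, key=lambda i: entropy(columns[i]))
        for group in cluster_dimensions(columns, cut)
    )
```

Each dimension is passed as one list of symbols, so `select_representatives([a, b, c])` returns the indices of the dimensions to keep. Choosing the highest-entropy member as the cluster representative is an illustrative design choice; the abstract does not say how FAAD selects among redundant dimensions.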
Pages: 388-404
Number of pages: 17