Online Outlier Detection for Data Streams

Cited by: 0
Authors
Sadik, Shiblee [1]
Gruenwald, Le [1]
Affiliation
[1] Univ Oklahoma, Norman, OK 73019 USA
Keywords
Knowledge Discovery; Data Mining; Stream Databases
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Outlier detection is a well-established area of statistics, but most existing outlier detection techniques are designed for applications where the entire dataset is available for random access. A typical outlier detection technique constructs a standard data distribution or model and identifies the data points that deviate from the model as outliers. Evidently, these techniques are not suitable for online data streams, where the entire dataset, due to its unbounded volume, is not available for random access. Moreover, the data distribution in data streams changes over time, which challenges existing outlier detection techniques that assume a constant standard data distribution for the entire dataset. In addition, data streams are characterized by uncertainty, which imposes further complexity. In this paper, we propose an adaptive, online outlier detection technique that addresses the aforementioned characteristics of data streams, called Adaptive Outlier Detection for Data Streams (A-ODDS), which identifies outliers with respect to all the received data points as well as temporally close data points. The temporally close data points are selected based on time and change of data distribution. We also present an efficient, online implementation of the technique and a performance study showing the superiority of A-ODDS over existing techniques in terms of accuracy and execution time on a real-life dataset collected from meteorological applications.
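The abstract does not give the A-ODDS algorithm itself, so the following is only a minimal sketch of the general idea it describes: scoring each arriving point against all data received so far and against temporally close data. Everything here is an assumption made for illustration, not the authors' method: the z-score deviation measure, the fixed-size sliding window standing in for "temporally close" points, the rule that a point must exceed the threshold in both views, and the names `RunningStats`, `StreamOutlierSketch`, `window`, and `threshold`.

```python
from collections import deque
import math


class RunningStats:
    """Welford's online mean/variance over every point seen so far."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Sample standard deviation; 0.0 until at least two points exist.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def zscore(x, mean, std):
    # Treat a degenerate (zero-spread) reference as "no evidence of deviation".
    return abs(x - mean) / std if std > 0 else 0.0


class StreamOutlierSketch:
    """Hypothetical detector: global view plus a sliding-window local view."""

    def __init__(self, window=20, threshold=3.0):
        self.global_stats = RunningStats()
        self.recent = deque(maxlen=window)  # assumed proxy for "temporally close"
        self.threshold = threshold

    def score(self, x):
        """Return (global_z, local_z) for x, then absorb x into the state."""
        g = zscore(x, self.global_stats.mean, self.global_stats.std())
        if len(self.recent) > 1:
            m = sum(self.recent) / len(self.recent)
            var = sum((v - m) ** 2 for v in self.recent) / (len(self.recent) - 1)
            loc = zscore(x, m, math.sqrt(var))
        else:
            loc = 0.0
        self.global_stats.update(x)
        self.recent.append(x)
        return g, loc

    def is_outlier(self, x):
        # Assumed combination rule: the point must deviate in both views.
        g, loc = self.score(x)
        return g > self.threshold and loc > self.threshold


detector = StreamOutlierSketch(window=20, threshold=3.0)
stream = [10.0 + 0.1 * ((i % 3) - 1) for i in range(30)]
flags = [detector.is_outlier(v) for v in stream]
spike_flag = detector.is_outlier(50.0)
```

The sketch processes each point once and keeps only constant-size state, which matches the abstract's constraint that the full dataset is not available for random access. Unlike the technique described in the abstract, it does not adapt its window to changes in the data distribution.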
Pages: 88-96
Page count: 9