Efficient density and cluster based incremental outlier detection in data streams

Cited by: 45
Authors
Degirmenci, Ali [1]
Karal, Omer [1]
Affiliations
[1] Ankara Yildirim Beyazit Univ, Ayvali Mah 150,Sok Etlik Kecioren, Ankara, Turkey
Keywords
LOF; DBSCAN; Outlier detection; Core KNN; Incremental learning; Data stream;
DOI
10.1016/j.ins.2022.06.013
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper presents a novel, parameter-free, incremental local density and cluster-based outlier factor (iLDCBOF) method that unifies incremental versions of the local outlier factor (LOF) and density-based spatial clustering of applications with noise (DBSCAN) to detect outliers efficiently in data streams. iLDCBOF has three advantages over previously reported iLOF-based studies: (1) it is based on a newly developed core k-nearest neighbor (CkNN) concept that detects outliers in data streams reliably and scalably and prevents outliers from being clustered; (2) it uses a newly developed algorithm that automatically adjusts the parameter k (the number of neighbors) for different real-time applications; and (3) it uses the Mahalanobis distance metric, so its performance is not affected even for large amounts of data. The iLDCBOF method is well suited to different data stream applications because it makes no distributional assumptions, it is parameter-free (k is determined automatically), and it is easy to implement. ROC-AUC and statistical test results from extensive experiments on 16 real-world datasets showed that iLDCBOF significantly outperformed the benchmark methods. (c) 2022 Elsevier Inc. All rights reserved.
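For orientation, the two building blocks named in the abstract, LOF and the Mahalanobis distance, can be combined in a few lines. The sketch below is not the iLDCBOF method: it is the classic batch LOF with a fixed, hand-picked k, and it omits the incremental updates, the CkNN concept, and the automatic selection of k. The function names `mahalanobis_matrix` and `lof_scores`, the toy data, and the choice k = 10 are assumptions made for illustration only.

```python
import numpy as np

def mahalanobis_matrix(X):
    """Pairwise Mahalanobis distances using the sample covariance of X.

    Assumes X has shape (n_samples, n_features) with n_features >= 2.
    """
    cov = np.cov(X, rowvar=False)
    vi = np.linalg.pinv(cov)  # pseudo-inverse guards against a singular covariance
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,kl,ijl->ij", diff, vi, diff)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from round-off

def lof_scores(X, k=5):
    """Classic batch LOF score per point (ties in the k-distance are ignored)."""
    D = mahalanobis_matrix(X)
    np.fill_diagonal(D, np.inf)                   # a point is not its own neighbor
    nn = np.argsort(D, axis=1)[:, :k]             # indices of the k nearest neighbors
    nn_dist = np.take_along_axis(D, nn, axis=1)   # distances to those neighbors
    k_dist = nn_dist[:, -1]                       # k-distance of every point
    reach = np.maximum(nn_dist, k_dist[nn])       # reach-dist(p, o) = max(d(p, o), k-dist(o))
    lrd = 1.0 / (reach.mean(axis=1) + 1e-12)      # local reachability density
    return lrd[nn].mean(axis=1) / lrd             # mean neighbor density over own density

# Toy data (hypothetical): one Gaussian cluster plus a single distant point.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(100, 2)), [[8.0, 8.0]]])
scores = lof_scores(X, k=10)
print(int(np.argmax(scores)), float(scores[-1]))
```

Scores near 1 indicate a point whose local density matches that of its neighbors; scores well above 1 indicate a likely outlier. A streaming method such as the one described in the abstract would update these quantities as points arrive instead of recomputing them from scratch.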
Pages: 901-920
Number of pages: 20
Related papers
50 in total
  • [31] Outlier Detection Data Mining of Tax Based on Cluster
    Liu, Bin
    Xu, Guang
    Xu, Qian
    Zhang, Nan
    2012 INTERNATIONAL CONFERENCE ON MEDICAL PHYSICS AND BIOMEDICAL ENGINEERING (ICMPBE2012), 2012, 33 : 1689 - 1694
  • [32] ADINOF: adaptive density summarizing incremental natural outlier detection in data stream
    Singh, Manmohan
    Pamula, Rajendra
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (15): : 9607 - 9623
  • [33] An Efficient Distance and Density Based Outlier Detection Approach
    Zhong, Xunbiao
    Huang, Xiaoxia
    MECHANICAL ENGINEERING AND GREEN MANUFACTURING II, PTS 1 AND 2, 2012, 155-156 : 342 - 347
  • [35] Fast Memory Efficient Local Outlier Detection in Data Streams (Extended Abstract)
    Salehi, Mahsa
    Leckie, Christopher
    Bezdek, James C.
    Vaithianathan, Tharshan
    Zhang, Xuyun
    2017 IEEE 33RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2017), 2017, : 51 - 52
  • [36] Outlier Detection of Traction Energy Consumption Based on Local Density and Cluster for Time Series Data
    Zhang, Chengxi
    Xun, Jing
    Ji, Zhihui
    Yin, Chenkun
    Cao, Jiang
    2023 IEEE 12TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE, DDCLS, 2023, : 1288 - 1295
  • [37] An Effective Minimal Probing Approach With Micro-Cluster for Distance-Based Outlier Detection in Data Streams
    Bah, Mohamed Jaward
    Wang, Hongzhi
    Hammad, Mohamed
    Zeshan, Furkh
    Aljuaid, Hanan
    IEEE ACCESS, 2019, 7 : 154922 - 154934
  • [38] Adaptive Threshold for Outlier Detection on Data Streams
    Clark, James P.
    Liu, Zhen
    Japkowicz, Nathalie
    2018 IEEE 5TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2018, : 41 - 49
  • [39] A Survey of Outlier Detection Algorithms for Data Streams
    Tamboli, Jinita
    Shukla, Madhu
    PROCEEDINGS OF THE 10TH INDIACOM - 2016 3RD INTERNATIONAL CONFERENCE ON COMPUTING FOR SUSTAINABLE GLOBAL DEVELOPMENT, 2016, : 3535 - 3540
  • [40] Outlier and anomaly pattern detection on data streams
    Cheong Hee Park
    The Journal of Supercomputing, 2019, 75 : 6118 - 6128