Efficient density and cluster based incremental outlier detection in data streams

Cited by: 45
Authors
Degirmenci, Ali [1]
Karal, Omer [1]
Affiliations
[1] Ankara Yildirim Beyazit Univ, Ayvali Mah 150,Sok Etlik Kecioren, Ankara, Turkey
Keywords
LOF; DBSCAN; Outlier detection; Core KNN; Incremental learning; Data stream
DOI
10.1016/j.ins.2022.06.013
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, a novel, parameter-free, incremental local density and cluster-based outlier factor (iLDCBOF) method is presented that unifies incremental versions of local outlier factor (LOF) and density-based spatial clustering of applications with noise (DBSCAN) to detect outliers efficiently in data streams. The iLDCBOF has several advantages over previously reported iLOF-based studies: (1) it is based on a newly developed core k-nearest neighbor (CkNN) concept to reliably and scalably detect outliers from data streams and prevent the clustering of outliers; (2) it uses a newly developed algorithm that automatically adjusts the value of the k (number of neighbors) parameter for different real-time applications; and (3) it uses the Mahalanobis distance metric, so its performance is not affected even for large amounts of data. The iLDCBOF method is well suited for different data stream applications because it requires no distribution assumptions, it is parameter-free (its parameters are determined automatically), and it is easy to implement. ROC-AUC and statistical test analysis results from extensive experiments performed on 16 different real-world datasets showed that the iLDCBOF method significantly outperformed benchmark methods. (c) 2022 Elsevier Inc. All rights reserved.
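For readers unfamiliar with LOF, the score that the abstract builds on can be sketched as below. This is a minimal batch illustration of the classic local outlier factor and is not the authors' iLDCBOF: it does not reproduce the incremental updates, the CkNN concept, the automatic selection of k, or the Mahalanobis distance. The function names, the toy points, the fixed k = 3, and the Euclidean default distance are all assumptions made for illustration. Ties at the k-th distance are cut to exactly k neighbors, which is a simplification of the original LOF definition.

```python
import math

def knn(points, i, k, dist):
    """Return the k nearest neighbors of points[i] as (distance, index) pairs."""
    pairs = sorted((dist(points[i], p), j) for j, p in enumerate(points) if j != i)
    return pairs[:k]

def lof_scores(points, k, dist=math.dist):
    """Batch LOF: scores near 1 indicate inliers, much larger values outliers.

    Assumes all points are distinct (duplicates would give zero distances).
    """
    n = len(points)
    neigh = [knn(points, i, k, dist) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbor
    k_dist = [nb[-1][0] for nb in neigh]
    # local reachability density: inverse of the mean reachability distance
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean ratio of the neighbors' density to the point's own density
    return [sum(lrd[j] for _, j in neigh[i]) / (k * lrd[i]) for i in range(n)]

# Hypothetical toy data: a small cluster plus one distant point (index 5)
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (8, 8)]
scores = lof_scores(points, k=3)
outlier_index = scores.index(max(scores))
print(outlier_index, [round(s, 2) for s in scores])
```

The `dist` argument is left pluggable so that a different metric, such as a Mahalanobis distance built from an estimated covariance matrix, could be substituted. A streaming variant would additionally need to update only the neighborhoods affected by each new point rather than recomputing every score.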
Pages: 901 - 920
Page count: 20
Related papers
50 in total
  • [41] Outlier and anomaly pattern detection on data streams
    Park, Cheong Hee
    JOURNAL OF SUPERCOMPUTING, 2019, 75 (09): : 6118 - 6128
  • [42] Attribute Outlier Detection over Data Streams
    Cao, Hui
    Zhou, Yongluan
    Shou, Lidan
    Chen, Gang
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PT II, PROCEEDINGS, 2010, 5982 : 216 - +
  • [43] Trajectory Outlier Detection on Trajectory Data Streams
    Cao, Keyan
    Liu, Yefan
    Meng, Gongjie
    Liu, Haoli
    Miao, Anchen
    Xu, Jingke
    IEEE Access, 2020, 8 : 34187 - 34196
  • [45] Outlier detection over data streams: Survey
    Brahmi Z.
    Souiden I.
    International Journal of Business Intelligence and Data Mining, 2021, 19 (04) : 481 - 507
  • [46] Continuous Outlier Detection on Uncertain Data Streams
    Shaikh, Salman Ahmed
    Kitagawa, Hiroyuki
    2014 IEEE NINTH INTERNATIONAL CONFERENCE ON INTELLIGENT SENSORS, SENSOR NETWORKS AND INFORMATION PROCESSING (IEEE ISSNIP 2014), 2014,
  • [47] KDE based outlier detection on distributed data streams in multimedia network
    Zhigao Zheng
    Hwa-Young Jeong
    Tao Huang
    Jiangbo Shu
    Multimedia Tools and Applications, 2017, 76 : 18027 - 18045
  • [48] KDE based outlier detection on distributed data streams in multimedia network
    Zheng, Zhigao
    Jeong, Hwa-Young
    Huang, Tao
    Shu, Jiangbo
    MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (17) : 18027 - 18045
  • [49] A Cluster-Based Outlier Detection Scheme for Multivariate Data
    Jobe, J. Marcus
    Pokojovy, Michael
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2015, 110 (512) : 1543 - 1551
  • [50] IPMOD: An efficient outlier detection model for high-dimensional medical data streams
    Yang, Yun
    Fan, ChongJun
    Chen, Liang
    Xiong, HongLin
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 191