Improved outlier detection and interpretation method for DPC clustering algorithm

Authors
Zhou, Yu [1]
Xia, Hao [1]
Pei, Zexuan [1]
Affiliations
[1] School of Electrical Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450045, China
Keywords
Anomaly detection; Clustering algorithms; Nearest neighbor search
DOI
10.11918/202305067
Abstract
To address the limitations of global outlier detection methods in detecting local outliers, and the performance degradation of local anomaly factor methods when a large number of local outliers are present, this paper proposes an outlier detection and interpretation method based on an improved fast-search-and-find density peak clustering (DPC) algorithm, called KDPC, which uses k-nearest neighbor (KNN) and kernel density estimation (KDE) methods. The method analyzes data points from both the global and the local perspective at the same time. First, the local density of each data point is computed using KNN and KDE, replacing the truncation-distance-based local density of the traditional DPC algorithm. Second, the sum of each data point's k-nearest-neighbor distances is used as its global outlier score, and the cluster density and the local outlier score of each data point are computed by the KDPC clustering algorithm. Finally, the global and local outlier scores of each data point are multiplied to give its final anomaly score. The top-n data points with the highest anomaly scores are selected as outliers, and the global and local outliers are interpreted by constructing a global-local outlier decision diagram. Experiments were conducted on both artificial and UCI datasets, and the method was compared with 10 commonly used outlier detection methods. The results show that the method achieves high detection accuracy for both global and local outliers, and that its AUC is only minimally affected by the value of k. The method is also applied to NBA player data, further demonstrating its practicality and effectiveness. © 2024 Harbin Institute of Technology. All rights reserved.
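The scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' KDPC implementation: the abstract does not give the kernel, the bandwidth, or the formula for the local outlier score, so the Gaussian kernel, the bandwidth `h` (mean KNN distance), and the density-ratio local score below are all assumptions made for illustration, as are the function names. Only the global score (sum of KNN distances), the multiplication of global and local scores, and the top-n selection follow the abstract directly.

```python
import numpy as np

def knn_distances(X, k):
    """Distances and indices of each point's k nearest neighbours (self excluded)."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(D, np.inf)              # a point is not its own neighbour
    idx = np.argsort(D, axis=1)[:, :k]
    dists = np.take_along_axis(D, idx, axis=1)
    return dists, idx

def anomaly_scores(X, k=5, eps=1e-12):
    """Return (final, global, local) scores for every row of X."""
    dists, idx = knn_distances(X, k)

    # Local density: Gaussian kernel averaged over the k nearest neighbours.
    # ASSUMPTION: kernel choice and bandwidth h are not specified in the abstract.
    h = dists.mean()
    density = np.exp(-0.5 * (dists / h) ** 2).mean(axis=1)

    # Global score: sum of the k-nearest-neighbour distances (as in the abstract).
    global_score = dists.sum(axis=1)

    # Local score: mean neighbour density divided by the point's own density.
    # ASSUMPTION: a stand-in for the KDPC-based local score, which the
    # abstract does not define.
    local_score = density[idx].mean(axis=1) / (density + eps)

    # Final anomaly score: product of global and local scores (as in the abstract).
    return global_score * local_score, global_score, local_score

def top_n_outliers(scores, n):
    """Indices of the n points with the highest anomaly score."""
    return np.argsort(scores)[::-1][:n]

if __name__ == "__main__":
    # A 5 x 5 grid of inliers plus one distant point appended as the last row.
    grid = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float)
    X = np.vstack([grid, [[20.0, 20.0]]])
    scores, g, l = anomaly_scores(X, k=4)
    print(top_n_outliers(scores, 3))
```

Plotting `global_score` against `local_score` for each point gives a rough analogue of the global-local decision diagram mentioned in the abstract, under the same assumptions.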
Pages: 68 - 85