Improved outlier detection and interpretation method for DPC clustering algorithm

Authors
Zhou, Yu [1]
Xia, Hao [1]
Pei, Zexuan [1]
Affiliations
[1] School of Electrical Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450045, China
Keywords
Anomaly detection; Clustering algorithms; Nearest neighbor search
DOI
10.11918/202305067
Abstract
Global outlier detection methods struggle to detect local outliers, and methods based on local outlier factors degrade when a large number of local outliers is present. To address both limitations, this paper proposes an outlier detection and interpretation method based on an improved version (KDPC) of the clustering by fast search and find of density peaks (DPC) algorithm, using k-nearest neighbor (KNN) and kernel density estimation (KDE) methods. The method analyzes data points from the global and the local perspective simultaneously. First, the local density of each data point is computed with KNN and KDE, replacing the cutoff-distance-based local density of the traditional DPC algorithm. Second, the sum of a point's k-nearest-neighbor distances is used as its global outlier score, while the cluster density and the local outlier score of each point are computed by the KDPC clustering algorithm. Finally, the global and local outlier scores of each point are multiplied to give the final anomaly score. The top-n data points with the highest anomaly scores are selected as outliers, and the global and local outliers are interpreted by constructing a global-local outlier decision diagram. Experiments on both artificial and UCI datasets compared the method with 10 commonly used outlier detection methods. The results show that the method achieves high detection accuracy for both global and local outliers, and that its AUC is only minimally affected by the value of k. The method is also applied to NBA player data, further demonstrating its practicality and effectiveness. © 2024 Harbin Institute of Technology. All rights reserved.
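The scoring pipeline described in the abstract (KNN plus KDE local density, a global score from summed k-nearest-neighbor distances, a product of global and local scores, and top-n selection) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the Gaussian kernel, the `bandwidth` parameter, and in particular the local score are assumptions. The abstract says the local score comes from KDPC clustering but gives no formula, so a simple neighbor-to-own density ratio is used here as a stand-in.

```python
import numpy as np


def knn_distances(X, k):
    """Return (dists, idx) for the k nearest neighbors of each row, excluding itself."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))      # pairwise Euclidean distances
    np.fill_diagonal(D, np.inf)                # a point is not its own neighbor
    idx = np.argsort(D, axis=1)[:, :k]         # indices of the k nearest neighbors
    dists = np.take_along_axis(D, idx, axis=1)
    return dists, idx


def anomaly_scores(X, k=5, bandwidth=1.0):
    """Hypothetical global * local anomaly score for each row of X."""
    dists, idx = knn_distances(X, k)

    # Local density: Gaussian kernel evaluated over the k nearest neighbors only
    # (kernel choice and bandwidth are assumptions, not taken from the paper).
    density = np.exp(-(dists ** 2) / (2.0 * bandwidth ** 2)).mean(axis=1)

    # Global score: sum of the k-nearest-neighbor distances, as in the abstract.
    global_score = dists.sum(axis=1)

    # ASSUMPTION: stand-in local score, the mean density of a point's neighbors
    # divided by its own density. The paper derives this from KDPC clusters.
    local_score = density[idx].mean(axis=1) / (density + 1e-12)

    # Final anomaly score: product of the global and local scores.
    return global_score * local_score


def top_n_outliers(scores, n):
    """Indices of the n points with the highest anomaly score."""
    return np.argsort(scores)[::-1][:n]


if __name__ == "__main__":
    grid = np.linspace(0.0, 0.4, 5)
    # 25 tightly packed grid points plus one isolated point at index 25.
    X = np.array([[a, b] for a in grid for b in grid] + [[10.0, 10.0]])
    scores = anomaly_scores(X, k=3)
    print(top_n_outliers(scores, 1))
```

Multiplying the two scores means a point ranks highly only when it is both far from its neighbors and sparse relative to them, which is one plausible reading of why the abstract combines the global and local views.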
Pages: 68-85