Improved outlier detection and interpretation method for DPC clustering algorithm

Authors
Zhou, Yu [1]
Xia, Hao [1]
Pei, Zexuan [1]
Affiliations
[1] School of Electrical Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450045, China
Keywords
Anomaly detection - Clustering algorithms - Nearest neighbor search
DOI
10.11918/202305067
Abstract
To address the limitations of global outlier detection methods in detecting local outliers, and the performance degradation of local outlier factor methods when many local outliers are present, this paper proposes an outlier detection and interpretation method based on an improved version (KDPC) of the clustering by fast search and find of density peaks (DPC) algorithm, built on k-nearest neighbor (KNN) and kernel density estimation (KDE) methods. The method analyzes data points from both the global and the local perspective at the same time. First, the local density of each data point is calculated with the k-nearest neighbor and kernel density estimation methods, replacing the truncation-distance-based local density of the traditional DPC algorithm. Second, the sum of each data point's k-nearest-neighbor distances is used as its global outlier score, while the cluster density and the local outlier score of each data point are calculated by the KDPC clustering algorithm. Finally, the global and local outlier scores of each data point are multiplied to give its final anomaly score. The top-n data points with the highest anomaly scores are selected as outliers, and the global and local outliers are interpreted by constructing a global-local outlier decision diagram. Experiments were conducted on both artificial and UCI datasets, and the method was compared with 10 commonly used outlier detection methods. The results show that the method achieves high detection accuracy and performance for both global and local outliers, and that its AUC is only minimally affected by the value of k. The method is also applied to NBA player data, further demonstrating its practicality and effectiveness. © 2024 Harbin Institute of Technology. All rights reserved.
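The abstract describes the scoring pipeline but gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' KDPC implementation. Everything concrete in it is an assumption: the Gaussian kernel, the bandwidth `h` (taken here as the mean k-nearest-neighbor distance), and above all the local score, which is a simple LOF-style density ratio swapped in for the cluster-based local score that the paper derives from KDPC clustering. The function names `knn_distances`, `anomaly_scores`, and `top_n_outliers` are hypothetical.

```python
import numpy as np

def knn_distances(X, k):
    """Indices and distances of each row's k nearest neighbours (self excluded)."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)              # a point is not its own neighbour
    idx = np.argsort(dist, axis=1)[:, :k]
    return idx, np.take_along_axis(dist, idx, axis=1)

def anomaly_scores(X, k=5):
    """Product of a global and a local outlier score for every row of X."""
    idx, d = knn_distances(X, k)

    # Global score: sum of k-nearest-neighbour distances (as stated in the abstract).
    global_score = d.sum(axis=1)

    # Local density: Gaussian kernel averaged over the k neighbours.
    # ASSUMPTION: kernel choice and bandwidth are illustrative only.
    h = max(d.mean(), 1e-12)
    density = np.exp(-(d ** 2) / (2 * h ** 2)).mean(axis=1)

    # Local score: neighbours' mean density relative to the point's own density.
    # ASSUMPTION: stand-in for the paper's KDPC cluster-density-based score.
    local_score = density[idx].mean(axis=1) / (density + 1e-12)

    # Final anomaly score: global score times local score.
    return global_score * local_score

def top_n_outliers(X, n, k=5):
    """Indices of the n rows with the highest anomaly score, highest first."""
    scores = anomaly_scores(X, k)
    return np.argsort(scores)[::-1][:n]
```

As a usage example, calling `top_n_outliers(X, n=10, k=5)` on an array `X` of shape `(n_samples, n_features)` returns the row indices of the ten highest-scoring points. Multiplying the two scores means a point ranks high only when it is both far from its neighbors and sparse relative to them, which is the intuition the abstract gives for combining global and local evidence.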
Pages: 68 - 85
Related papers
  • [1] Outlier detection method based on improved DPC algorithm and centrifugal factor
    Xia, Hao
    Zhou, Yu
    Li, Jiguang
    Yue, Xuezhen
    Li, Jichun
    INFORMATION SCIENCES, 2024, 682
  • [2] A Spectral Clustering Algorithm for Outlier Detection
    Yang, Peng
    Huang, Biao
    2008 INTERNATIONAL SEMINAR ON FUTURE INFORMATION TECHNOLOGY AND MANAGEMENT ENGINEERING, PROCEEDINGS, 2008, : 33 - 36
  • [3] Outlier Detection Method based on Improved Two-step Clustering Algorithm and Synthetic Hypothesis Testing
    Huang, Geyu
    Zhang, Zhiming
    Yang, Wenxin
    PROCEEDINGS OF 2019 IEEE 8TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC 2019), 2019, : 915 - 919
  • [4] An Improved Semisupervised Outlier Detection Algorithm Based on Adaptive Feature Weighted Clustering
    Deng, Tingquan
    Yang, Jinhong
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2016, 2016
  • [5] A Clustering Algorithm for Tumor Gene Data Based on Improved DPC Algorithm
    Wang W.
    Gao B.
    International Journal Bioautomation, 2022, 26 (02): 175 - 192
  • [6] A Practical Algorithm for Distributed Clustering and Outlier Detection
    Chen, Jiecao
    Azer, Erfan Sadeqi
    Zhang, Qin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [7] An Effective Algorithm of Outlier Detection Based on Clustering
    Xia, Qingsong
    Xing, Changzheng
    Li, Na
    INTERNET OF THINGS-BK, 2012, 312 : 346 - 351
  • [8] An Outlier Detection Algorithm Based on Spectral Clustering
    Yang, Peng
    Huang, Biao
    PACIIA: 2008 PACIFIC-ASIA WORKSHOP ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION, VOLS 1-3, PROCEEDINGS, 2008, : 485 - 488
  • [9] Automatic PAM clustering algorithm for outlier detection
    Zhu, Q. (qszhu@cqu.edu.cn), 1600, Academy Publisher (07):
  • [10] Outlier Detection Algorithm Based on Iterative Clustering
    古平
    罗辛
    杨瑞龙
    张程
    Journal of Donghua University(English Edition), 2015, 32 (04) : 554 - 558