A Novel Unsupervised Outlier Detection Algorithm Based on Mutual Information and Reduced Spectral Clustering

Cited by: 1
Authors
Huang, Yuehua [1, 2, 3]
Liu, Wenfen [1, 2, 3]
Li, Song [1, 2]
Guo, Ying [1, 2]
Chen, Wen [1, 2]
Affiliations
[1] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China
[2] Guilin Univ Elect Technol, Sch Software Engn, Guilin 541004, Peoples R China
[3] Guangxi Key Lab Cryptog & Informat Secur, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
outlier detection; unsupervised; mutual information; spectral clustering
DOI
10.3390/electronics12234864
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Outlier detection is an essential research field in data mining, with applications in network security, credit card fraud detection, and industrial flaw detection, among others. Existing outlier detection algorithms, which can be divided into supervised and unsupervised methods, suffer from three problems: the curse of dimensionality, a lack of labeled data, and the need for hyperparameter tuning. To address these issues, we present a novel unsupervised outlier detection algorithm based on mutual information and reduced spectral clustering, called MISC-OD (Mutual Information and reduced Spectral Clustering-Outlier Detection). MISC-OD first constructs a mutual information matrix between features and then applies reduced spectral clustering to divide the feature set into subsets. It runs LOF (Local Outlier Factor) within each subset, combines the per-subset outlier scores, and outputs the final outlier score. Our contributions are as follows: (1) we propose a novel outlier detection method, MISC-OD, with high interpretability and scalability; (2) experiments on 18 benchmark datasets demonstrate the superior performance of MISC-OD compared with eight state-of-the-art baselines in terms of ROC (receiver operating characteristic) and AP (average precision).
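The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative approximation only, not the authors' implementation: the abstract does not define "reduced" spectral clustering, how mutual information is estimated, or how per-subset scores are combined. The sketch therefore assumes scikit-learn's standard `SpectralClustering` on a precomputed affinity, a histogram-based mutual information estimate, and a plain average of LOF scores. The names `mi_matrix`, `misc_od_sketch`, `n_groups`, and `bins` are hypothetical.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import mutual_info_score
from sklearn.neighbors import LocalOutlierFactor


def mi_matrix(X, bins=10):
    """Pairwise mutual information between features, estimated on binned values.

    Assumption: equal-width binning; the abstract does not specify the estimator.
    """
    n_features = X.shape[1]
    binned = np.column_stack([
        np.digitize(col, np.histogram_bin_edges(col, bins=bins)[1:-1])
        for col in X.T
    ])
    mi = np.zeros((n_features, n_features))
    for i in range(n_features):
        for j in range(i, n_features):
            mi[i, j] = mi[j, i] = mutual_info_score(binned[:, i], binned[:, j])
    return mi


def misc_od_sketch(X, n_groups=2, n_neighbors=20, bins=10, random_state=0):
    """Return (outlier scores per sample, feature-group label per feature)."""
    X = np.asarray(X, dtype=float)

    # Step 1: mutual information matrix between features, used as an affinity.
    affinity = mi_matrix(X, bins=bins)

    # Step 2: group features with spectral clustering.
    # Assumption: ordinary spectral clustering stands in for the "reduced" variant.
    labels = SpectralClustering(
        n_clusters=n_groups, affinity="precomputed", random_state=random_state
    ).fit_predict(affinity)

    # Step 3: run LOF on each feature subset.
    # Step 4: combine per-subset scores (assumption: simple average).
    groups = np.unique(labels)
    scores = np.zeros(X.shape[0])
    for g in groups:
        lof = LocalOutlierFactor(n_neighbors=n_neighbors)
        lof.fit(X[:, labels == g])
        # negative_outlier_factor_ is the negated LOF; flip the sign so that
        # larger values mean "more outlying".
        scores += -lof.negative_outlier_factor_
    return scores / len(groups), labels
```

Calling `misc_od_sketch(X)` on an `(n_samples, n_features)` array returns one score per sample, where higher means more outlying, together with the feature grouping. Splitting features before scoring is what targets the curse of dimensionality: LOF only ever measures distances inside a low-dimensional subset of related features.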
Pages: 12