Classification for Imbalanced and Overlapping Classes Using Outlier Detection and Sampling Techniques

Cited by: 8
Authors
Yang, Zeping [1]
Gao, Daqi [1]
Affiliations
[1] E China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai 200237, Peoples R China
Keywords
Under-sampling; outlier detection; overlapping; imbalanced data; artificial neural network (ANN)
DOI
10.12785/amis/071L50
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In many real-world applications, the example data of different pattern classes are imbalanced and overlapping, which hinders the classification performance of many learning algorithms. In this paper, a data cleaning technique based on the borderline noise factor (BNF) is proposed to remove borderline noise, and three under-sampling methods are studied to select representative majority-class examples and to remove distant samples that do not help form the decision boundary. BNF measures the degree to which an example is borderline noise, and the outlier detection algorithm is improved to clean the whole dataset. G-mean (geometric mean) is used to define the threshold, which can improve the classification accuracy of minority classes while achieving better overall classification performance. The experimental results demonstrate the effectiveness of the sampling methods combined with the BNF-based data cleaning technique.
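The abstract names G-mean as the criterion for setting the threshold but does not give its formula. As a minimal sketch, assuming the common two-class definition (the square root of minority-class recall times majority-class recall), it could be computed as below. The function name `g_mean` and the `positive` parameter are illustrative and are not taken from the paper.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of the recall on the positive (minority) class
    and the recall on the negative (majority) class.

    Assumes a two-class problem; `positive` marks the minority label.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Recall on each class; defined as 0.0 when a class has no examples.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Two minority examples and eight majority examples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_majority = [0] * 10                        # 80% accuracy, no minority recall
one_hit = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]          # one minority hit, one false alarm

print(g_mean(y_true, always_majority))
print(g_mean(y_true, one_hit))
```

A classifier that always predicts the majority class reaches 80% accuracy on this toy data yet scores a G-mean of zero, which is why the measure suits imbalanced problems better than plain accuracy.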
Pages: 375-381
Number of pages: 7