Classification for Imbalanced and Overlapping Classes Using Outlier Detection and Sampling Techniques

Cited by: 8
Authors
Yang, Zeping [1]
Gao, Daqi [1]
Affiliation
[1] East China University of Science and Technology, School of Information Science and Engineering, Shanghai 200237, People's Republic of China
Keywords
Under-sampling; outlier detection; overlapping; imbalanced data; artificial neural network (ANN)
DOI
10.12785/amis/071L50
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In many real-world applications, the example data among different pattern classes are imbalanced and overlapping, which hinders the classification performance of many learning algorithms. In this paper, a data cleaning technique based on the borderline noise factor (BNF) is proposed to remove borderline noise, and three under-sampling methods are studied to select representative majority-class examples and remove distant samples that are useless for forming the decision boundary. BNF measures the degree to which an example is borderline noise, and the outlier detection algorithm is improved to clean the whole dataset. G-mean (geometric mean) is used to define the threshold, which can improve the classification accuracy of minority classes while achieving better overall classification performance. The experimental results demonstrate the effectiveness of the sampling methods combined with data cleaning based on BNF.
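The abstract names G-mean as the criterion for choosing a threshold but does not spell it out. As a rough illustration, the sketch below uses the standard binary definition, the square root of sensitivity times specificity, and picks the candidate threshold with the highest G-mean. The function names `g_mean` and `best_threshold`, the use of generic classifier scores, and the toy data are assumptions made for illustration; the paper applies its threshold to BNF, whose formula is not given in this record.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    # Guard against a class with no examples.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def best_threshold(scores, y_true, candidates):
    """Hypothetical helper: return (threshold, G-mean) for the best candidate.

    A score at or above the threshold is labelled as the minority class (1).
    """
    best_t, best_g = None, -1.0
    for t in candidates:
        y_pred = [1 if s >= t else 0 for s in scores]
        g = g_mean(y_true, y_pred)
        if g > best_g:
            best_t, best_g = t, g
    return best_t, best_g


# Toy data (made up): two minority examples, four majority examples.
y_true = [1, 1, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.3, 0.2, 0.1, 0.6]
threshold, score = best_threshold(scores, y_true, [0.35, 0.5, 0.95])
print(threshold, round(score, 3))
```

A classifier that always predicts the majority class gets a G-mean of zero however high its plain accuracy is, which is why the measure suits imbalanced data.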
Pages: 375-381
Number of pages: 7