Classification for Imbalanced and Overlapping Classes Using Outlier Detection and Sampling Techniques

Cited by: 8
Authors
Yang, Zeping [1]
Gao, Daqi [1]
Affiliations
[1] E China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai 200237, Peoples R China
Keywords
Under-sampling; outlier detection; overlapping; imbalanced data; artificial neural network (ANN)
DOI
10.12785/amis/071L50
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In many real-world applications, the example data of different pattern classes are imbalanced and overlapping, which hinders the classification performance of many learning algorithms. In this paper, a data cleaning technique based on the borderline noise factor (BNF) is proposed to remove borderline noise, and three under-sampling methods are studied to select representative majority-class examples and remove distant samples that do not help form the decision boundary. BNF measures the degree to which an example is borderline noise, and the outlier detection algorithm is improved to clean the whole dataset. The G-mean (geometric mean) is used to define the threshold, which can improve the classification accuracy of minority classes while achieving better overall classification performance. The experimental results demonstrate the effectiveness of the sampling methods combined with data cleaning based on BNF.
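The abstract relies on the G-mean as its selection criterion. As a minimal sketch, assuming the common two-class definition (the square root of sensitivity times specificity), the snippet below shows why G-mean is more informative than plain accuracy on imbalanced data. The function name `g_mean` and the toy labels are illustrative assumptions, not taken from the paper, and the paper's BNF threshold procedure itself is not reproduced here.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Two-class G-mean: sqrt(sensitivity * specificity).

    Assumption: `positive` marks the minority class; every other
    label is treated as the majority (negative) class.
    """
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy data (hypothetical): 10 minority examples, 90 majority examples.
y_true = [1] * 10 + [0] * 90

# A classifier that always predicts the majority class:
# 90% accurate, but it never finds a minority example.
always_majority = [0] * 100

# A classifier that finds 8 of 10 minority examples
# at the cost of 9 false alarms on the majority class.
balanced = [1] * 8 + [0] * 2 + [0] * 81 + [1] * 9

print(g_mean(y_true, always_majority))
print(g_mean(y_true, balanced))
```

The always-majority classifier scores a G-mean of zero despite its high accuracy, because its sensitivity is zero. A criterion built on G-mean therefore cannot be satisfied by ignoring the minority class.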
Pages: 375-381
Number of pages: 7