Classification for Imbalanced and Overlapping Classes Using Outlier Detection and Sampling Techniques

Cited by: 8
Authors
Yang, Zeping [1]
Gao, Daqi [1]
Affiliations
[1] E China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai 200237, Peoples R China
Keywords
Under-sampling; outlier detection; overlapping; imbalanced data; artificial neural network (ANN)
DOI
10.12785/amis/071L50
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In many real-world applications, the example data in different pattern classes are imbalanced and overlapping, which hinders the classification performance of many learning algorithms. In this paper, a data cleaning technique based on the borderline noise factor (BNF) is proposed to remove borderline noise, and three under-sampling methods are studied to select representative majority-class examples and to remove distant samples that do not help form the decision boundary. BNF measures the degree to which an example is borderline noise, and the outlier detection algorithm is improved to clean the whole dataset. The G-mean (geometric mean) is used to define the threshold, which can improve the classification accuracy of minority classes while achieving better overall classification performance. The experimental results demonstrate the effectiveness of the sampling methods combined with the BNF-based data cleaning technique.
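As a minimal sketch of why G-mean suits imbalanced data, the snippet below uses the common binary definition, the square root of sensitivity times specificity. This is an assumption: the abstract does not give the paper's exact formulation, and the function names and toy labels are illustrative only.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def g_mean(y_true, y_pred, positive=1):
    """Binary G-mean: sqrt(sensitivity * specificity).

    `positive` marks the minority class (an assumption for this sketch).
    """
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy data: 2 minority examples (label 1) and 8 majority examples (label 0).
y_true = [1, 1] + [0] * 8
always_majority = [0] * 10            # ignores the minority class entirely
finds_minority = [1, 1, 1, 1] + [0] * 6  # finds both minority examples, 2 false alarms

for name, y_pred in [("always_majority", always_majority),
                     ("finds_minority", finds_minority)]:
    print(name, accuracy(y_true, y_pred), round(g_mean(y_true, y_pred), 4))
```

Both toy classifiers reach the same accuracy (0.8), but the one that never predicts the minority class has a G-mean of 0, which is why a G-mean-based criterion favors the minority class more than plain accuracy does.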
Pages: 375-381
Page count: 7
Related papers
50 in total
  • [21] Imbalanced data classification: Using transfer learning and active sampling
    Liu, Yang
    Yang, Guoping
    Qiao, Shaojie
    Liu, Meiqi
    Qu, Lulu
    Han, Nan
    Wu, Tao
    Yuan, Guan
    Peng, Yuzhong
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 117
  • [22] Android App Behaviour Classification Using Topic Modeling Techniques and Outlier detection using App Permissions
    Garg, Mayank
    Monga, Akshit
    Bhatt, Privank
    Arora, Anuja
    2016 FOURTH INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (PDGC), 2016, : 500 - 506
  • [23] Performance of outlier detection techniques based classification in Wireless Sensor Networks
    Ayadi, Aya
    Ghorbel, Oussama
    Bensaleh, M. S.
    Obeid, Abdelfateh
    Abid, Mohamed
    2017 13TH INTERNATIONAL WIRELESS COMMUNICATIONS AND MOBILE COMPUTING CONFERENCE (IWCMC), 2017, : 687 - 692
  • [24] Matrix sketching for supervised classification with imbalanced classes
    Roberta Falcone
    Laura Anderlucci
    Angela Montanari
    Data Mining and Knowledge Discovery, 2022, 36 : 174 - 208
  • [25] Analysis of Sampling Techniques Towards Epileptic Seizure Detection from Imbalanced Dataset
    Masum, Mohammad
    Shahriar, Hossain
    Haddad, Hisham
    2020 IEEE 44TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2020), 2020, : 684 - 692
  • [26] Matrix sketching for supervised classification with imbalanced classes
    Falcone, Roberta
    Anderlucci, Laura
    Montanari, Angela
    DATA MINING AND KNOWLEDGE DISCOVERY, 2022, 36 (01) : 174 - 208
  • [27] A Comparison of Re-sampling Techniques for Pattern Classification in Imbalanced Data-Sets
    Saul, Marcia Amstelvina
    Rostami, Shahin
    ADVANCES IN COMPUTATIONAL INTELLIGENCE SYSTEMS (UKCI), 2019, 840 : 240 - 251
  • [28] A novel overlapping minimization SMOTE algorithm for imbalanced classification
    He, Yulin
    Lu, Xuan
    Fournier-Viger, Philippe
    Huang, Joshua Zhexue
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2024, 25 (09) : 1266 - 1281
  • [29] CLASSIFICATION OF IMBALANCED HYPERSPECTRAL IMAGERY DATA USING SUPPORT VECTOR SAMPLING
    Zhang, Xiangrong
    Song, Qiang
    Zheng, Yaoguo
    Hou, Biao
    Gou, Shuiping
    2014 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2014,
  • [30] EHSO: Evolutionary Hybrid Sampling in overlapping scenarios for imbalanced learning
    Zhu, Yuanwei
    Yan, Yuanting
    Zhang, Yiwen
    Zhang, Yanping
    NEUROCOMPUTING, 2020, 417 : 333 - 346