Multiple Kernel Learning With Minority Oversampling for Classifying Imbalanced Data

Cited by: 6
Authors
Wang, Ling [1]
Wang, Hongqiao [1]
Fu, Guangyuan [1]
Affiliations
[1] Rocket Force Univ Engn, Dept Informat Engn, Xian 710025, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Training; Sensitivity; Shape; Classification algorithms; Kernel; Task analysis; Standards; Class imbalanced learning; multiple kernel learning; nonlinear oversampling; cost-sensitive
DOI
10.1109/ACCESS.2020.3046604
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Class imbalance problems, which arise from sampling bias or measurement error, occur frequently in real-world pattern classification tasks. Traditional classifiers focus on overall classification accuracy and ignore the minority class, which may degrade classification performance. Existing oversampling algorithms generally make specific assumptions to balance the class sizes and do not sufficiently consider the irregularities present in imbalanced data. As a result, these methods perform well only on certain benchmarks. In this paper, by combining minority oversampling with cost-sensitive learning, we propose multiple kernel learning with minority oversampling (MKLMO) to efficiently handle class imbalance problems involving small disjuncts, class overlap, and nonlinear shapes. Unlike existing methods, which first oversample the minority class and then train a standard classifier on the rebalanced data, the proposed MKLMO generates synthetic instances and trains the classifier simultaneously in the same feature space. Specifically, we define a distance metric in the optimal feature space obtained by multiple kernel learning and use the kernel trick to expand the original Gram matrix. Moreover, we assign different weights to instances based on the imbalance ratio to reduce the classifier's bias towards the majority class. To evaluate the proposed MKLMO method, experiments are performed on nine artificial and twenty-one real-world datasets. The experimental results show that our algorithm significantly outperforms the baseline algorithms in terms of the geometric mean (G-mean), especially in the presence of data irregularities.
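The ideas named in the abstract (a distance defined in the kernel-induced feature space, expansion of the Gram matrix with a synthetic instance, imbalance-based instance weights, and the G-mean metric) can be illustrated with a minimal sketch. This is not the MKLMO algorithm itself: the SMOTE-style interpolation between two minority instances in feature space, the fixed 0.5/0.5 kernel combination (multiple kernel learning would learn these weights), the inverse-frequency weighting rule, and all function names are assumptions made for illustration only.

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    """Gram matrix of the RBF kernel exp(-gamma * ||x - y||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def feature_dist_sq(K, i, j):
    """Squared distance between instances i and j in the feature space of K."""
    return K[i, i] - 2.0 * K[i, j] + K[j, j]

def expand_gram(K, i, j, delta):
    """Append one synthetic instance z to K without computing phi explicitly.

    Assumption: phi(z) = (1 - delta) * phi(x_i) + delta * phi(x_j),
    i.e. linear interpolation in feature space (hypothetical choice).
    """
    n = K.shape[0]
    row = (1.0 - delta) * K[i] + delta * K[j]           # <phi(z), phi(x_m)>
    self_sim = (1.0 - delta) * row[i] + delta * row[j]  # <phi(z), phi(z)>
    K_new = np.empty((n + 1, n + 1))
    K_new[:n, :n] = K
    K_new[n, :n] = row
    K_new[:n, n] = row
    K_new[n, n] = self_sim
    return K_new

def imbalance_weights(y):
    """Instance weights inversely proportional to class size (assumed rule)."""
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    lookup = dict(zip(classes, counts.max() / counts))
    return np.array([lookup[c] for c in y])

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of the true positive rate and the true negative rate."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    pos = y_true == positive
    tpr = np.mean(y_pred[pos] == positive)
    tnr = np.mean(y_pred[~pos] != positive)
    return float(np.sqrt(tpr * tnr))

# Toy data: 6 majority instances (label 0) and 2 minority instances (label 1).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])

# Fixed combination of two base kernels; MKL would learn these weights.
K = 0.5 * rbf_gram(X, gamma=0.5) + 0.5 * rbf_gram(X, gamma=2.0)

# One synthetic minority instance halfway between instances 6 and 7.
K_aug = expand_gram(K, 6, 7, delta=0.5)
weights = imbalance_weights(y)
```

The expanded matrix `K_aug` could then be passed to any classifier that accepts a precomputed kernel, with `weights` (extended by one entry for the synthetic instance) used as per-instance costs. Because the synthetic instance is defined only through inner products, it has no explicit coordinates in the input space.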
Pages: 565-580
Number of pages: 16