A weighted hybrid ensemble method for classifying imbalanced data

Cited by: 0
Authors
Zhao, Jiakun [1]
Jin, Ju [1]
Chen, Si [1]
Zhang, Ruifeng [1]
Yu, Bilin [2]
Liu, Qingfang [3]
Affiliations
[1] School of Software Engineering, Xi'an Jiaotong University, 710049, China
[2] School of Management, University of Science and Technology of China, 230026, China
[3] School of Mathematics and Statistics, Xi'an Jiaotong University, 710049, China
Funding
National Natural Science Foundation of China
Keywords
Benchmarking - Data mining - Classification (of information)
DOI
Not available
Abstract
Most real-world datasets are imbalanced. Data imbalance occurs when the number of instances in some classes greatly exceeds the number in other classes, and it degrades results in both data mining and machine learning. Current methods for handling data imbalance fall into three groups: data-level methods, algorithm-level methods and hybrid methods. In this paper, we propose a weighted hybrid ensemble method for classifying imbalanced data in binary classification tasks, called WHMBoost. Within the framework of the boosting algorithm, the method combines two data sampling methods and two base classifiers, and assigns a weight to each sampling method and each base classifier so that their strengths complement one another. WHMBoost is evaluated on 40 benchmark imbalanced datasets against state-of-the-art ensemble methods such as AdaBoost, RUSBoost and SMOTEBoost, using AUC, F-Measure and Geometric Mean as the performance criteria. Experimental results show significant improvement over the other methods, and it can be concluded that WHMBoost is a promising and effective algorithm for dealing with imbalanced datasets. © 2020 Elsevier B.V.
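The abstract names F-Measure and Geometric Mean as evaluation criteria but does not define them. The sketch below uses the standard textbook definitions (F-Measure as the harmonic mean of precision and recall, Geometric Mean as the square root of sensitivity times specificity), which are assumed here rather than taken from the paper. The function name and the confusion-matrix counts are hypothetical and chosen only for illustration. AUC is left out because it requires ranked classifier scores rather than counts.

```python
import math

def f_measure_and_gmean(tp, fp, tn, fn):
    """Return (F-measure, geometric mean) for the positive (minority) class.

    Standard definitions, assumed here; the paper's exact variants
    are not given in the abstract.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    gmean = math.sqrt(recall * specificity)
    return f1, gmean

# Hypothetical case: 10 minority and 990 majority instances, and a
# classifier that labels every instance as "majority".
tp, fp, tn, fn = 0, 0, 990, 10
accuracy = (tp + tn) / (tp + fp + tn + fn)
f1, gmean = f_measure_and_gmean(tp, fp, tn, fn)
print(accuracy, f1, gmean)
```

In this hypothetical case accuracy is high even though no minority instance is found, while both F-Measure and Geometric Mean fall to zero, which is why such metrics are commonly preferred for imbalanced data.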
Related papers
  • [1] A weighted hybrid ensemble method for classifying imbalanced data
    Zhao, Jiakun
    Jin, Ju
    Chen, Si
    Zhang, Ruifeng
    Yu, Bilin
    Liu, Qingfang
    KNOWLEDGE-BASED SYSTEMS, 2020, 203
  • [2] A novel ensemble method for classifying imbalanced data
    Sun, Zhongbin
    Song, Qinbao
    Zhu, Xiaoyan
    Sun, Heli
    Xu, Baowen
    Zhou, Yuming
    PATTERN RECOGNITION, 2015, 48 (05) : 1623 - 1637
  • [3] Adaptive Ensemble Method Based on Spatial Characteristics for Classifying Imbalanced Data
    Wang, Lei
    Zhao, Lei
    Gui, Guan
    Zheng, Baoyu
    Huang, Ruochen
    SCIENTIFIC PROGRAMMING, 2017, 2017
  • [5] Classifying imbalanced data using ensemble of reduced kernelized weighted extreme learning machine
    Raghuwanshi, Bhagat Singh
    Shukla, Sanyam
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2019, 10 (11) : 3071 - 3097
  • [6] EmSM: Ensemble Mixed Sampling Method for Classifying Imbalanced Intrusion Detection Data
    Jung, Ilok
    Ji, Jaewon
    Cho, Changseob
    ELECTRONICS, 2022, 11 (09)
  • [7] Hybrid Classifier Ensemble for Imbalanced Data
    Yang, Kaixiang
    Yu, Zhiwen
    Wen, Xin
    Cao, Wenming
    Chen, C. L. Philip
    Wong, Hau-San
    You, Jane
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (04) : 1387 - 1400
  • [8] A new sampling method for classifying imbalanced data based on support vector machine ensemble
    Jian, Chuanxia
    Gao, Jian
    Ao, Yinhui
    NEUROCOMPUTING, 2016, 193 : 115 - 122
  • [9] An Improved Ensemble Learning Method for Classifying High-Dimensional and Imbalanced Biomedicine Data
    Yu, Hualong
    Ni, Jun
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2014, 11 (04) : 657 - 666
  • [10] A selective evolutionary heterogeneous ensemble algorithm for classifying imbalanced data
    An, Xiaomeng
    Xu, Sen
    ELECTRONIC RESEARCH ARCHIVE, 2023, 31 (05) : 2733 - 2757