Self-paced ensemble and big data identification: a classification of substantial imbalance computational analysis

Cited by: 0
Authors
Bano, Shahzadi [1]
Zhi, Weimei [1]
Qiu, Baozhi [1]
Raza, Muhammad [2]
Sehito, Nabila [3]
Kamal, Mian Muhammad [4]
Aldehim, Ghadah [5]
Alruwais, Nuha [6]
Affiliations
[1] Zhengzhou Univ, Sch Comp & Artificial Intelligence, 100 Sci Ave, Zhengzhou 450001, Peoples R China
[2] Xian Technol Univ, Xian, Peoples R China
[3] Zhengzhou Univ, Sch Elect Informat Engn, 100 Sci Ave, Zhengzhou 450001, Henan, Peoples R China
[4] Southeast Univ, Sch Elect Sci & Engn, Joint Int Res Lab Informat Display & Visualizat, Nanjing 210018, Peoples R China
[5] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Informat Syst, POB 84428, Riyadh 11671, Saudi Arabia
[6] King Saud Univ, Coll Appl Studies & Community Serv, Dept Comp Sci & Engn, POB 22459, Riyadh 11495, Saudi Arabia
Source
JOURNAL OF SUPERCOMPUTING | 2024 / Vol. 80 / Issue 07
Keywords
Self-paced ensemble; Big data; Classification; Computational; Simulation; Substantial imbalance;
DOI
10.1007/s11227-023-05828-6
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This paper addresses the challenges of learning classifiers from large-scale, highly imbalanced datasets, which are prevalent in many real-world applications. Traditional learning algorithms often suffer from poor performance and low computational efficiency when dealing with imbalanced data. Factors such as class imbalance, noise, and class overlap make it difficult to learn effective classifiers. In this study, we propose a novel self-paced ensemble framework for classifying imbalanced data. The framework uses under-sampling to harmonize data hardness in a self-paced manner and build a robust ensemble. Extensive experiments demonstrate promising results in handling overlapping classes and skewed distributions while maintaining computational efficiency. The self-paced ensemble method addresses the challenges of high imbalance ratios, class overlap, and noise in large-scale imbalanced classification problems. By incorporating knowledge of these challenges into the learning framework, we establish the concept of classification hardness distribution. The self-paced ensemble is a new learning paradigm for massively imbalanced classification, capable of improving the performance of existing learning algorithms on imbalanced data and benefiting future applications.
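The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of one way under-sampling could harmonize data hardness. It assumes each majority-class sample already has a hardness score in [0, 1] (for example, the current ensemble's absolute prediction error). The function name `self_paced_undersample`, the equal-width binning, the `1 / (mean hardness + alpha)` bin weight, and the parameters `n_bins` and `alpha` are illustrative assumptions, not details taken from this record.

```python
import random


def self_paced_undersample(hardness, n_select, n_bins=5, alpha=0.0, rng=None):
    """Pick indices of majority samples, balancing across hardness bins.

    hardness: list of floats in [0, 1], one per majority-class sample
              (assumed to be supplied by the caller).
    n_select: target number of majority samples to keep.
    alpha:    hypothetical self-paced factor; larger values flatten
              the bin weights toward an equal share per bin.
    """
    if not hardness:
        return []
    rng = rng or random.Random(0)

    # Group sample indices into equal-width hardness bins.
    bins = [[] for _ in range(n_bins)]
    for idx, h in enumerate(hardness):
        bins[min(int(h * n_bins), n_bins - 1)].append(idx)

    # Weight each non-empty bin by 1 / (mean hardness + alpha).
    weights = []
    for members in bins:
        if members:
            mean_h = sum(hardness[i] for i in members) / len(members)
            weights.append(1.0 / (mean_h + alpha + 1e-12))
        else:
            weights.append(0.0)
    total = sum(weights)

    # Draw from each bin in proportion to its weight, capped by bin size.
    # Because of rounding and capping, the result may differ slightly
    # from n_select.
    chosen = []
    for members, w in zip(bins, weights):
        quota = min(len(members), int(round(n_select * w / total)))
        chosen.extend(rng.sample(members, quota))
    return chosen
```

With `alpha=0`, each bin contributes roughly the same total hardness to the subset, so easy samples dominate by count. Raising `alpha` across ensemble iterations would shift the subset toward an equal number of samples per bin, which gives harder samples relatively more influence. A base classifier would then be trained on the selected majority samples together with all minority samples.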
Pages: 9848 - 9869
Page count: 22
Related papers
50 in total
  • [11] Continuous EEG Classification for a Self-paced BCI
    Satti, Abdul
    Coyle, Damien
    Prasad, Girijesh
    2009 4TH INTERNATIONAL IEEE/EMBS CONFERENCE ON NEURAL ENGINEERING, 2009, : 308 - +
  • [12] Supervised Image Classification with Self-paced Regularization
    Zhang, Tao
    Gong, Chen
    Jia, Wenjing
    Song, Xiaoning
    Sun, Jun
    Wu, Xiaojun
    2018 18TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 2018, : 411 - 414
  • [13] SELF-PACED PROBABILISTIC PRINCIPAL COMPONENT ANALYSIS FOR DATA WITH OUTLIERS
    Zhao, Bowen
    Xiao, Xi
    Zhang, Wanpeng
    Zhang, Bin
    Gan, Guojun
    Xia, Shutao
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3737 - 3741
  • [14] Data augmentation for self-paced motor imagery classification with C-LSTM
    Freer, Daniel
    Yang, Guang-Zhong
    JOURNAL OF NEURAL ENGINEERING, 2020, 17 (01)
  • [15] Self-paced principal component analysis
    Kang, Zhao
    Liu, Hongfei
    Li, Jiangxin
    Zhu, Xiaofeng
    Tian, Ling
    PATTERN RECOGNITION, 2023, 142
  • [16] Ensemble Self-Paced Learning Based on Adaptive Mixture Weighting
    Liu, Liwen
    Wang, Zhong
    Bai, Jianbin
    Yang, Xiangfeng
    Yang, Yunchuan
    Zhou, Jianbo
    ELECTRONICS, 2022, 11 (19)
  • [17] Dual Self-Paced SMOTE for Imbalanced Data
    Shao, Yangguang
    Sun, Yingying
    Guan, Hongjiao
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 3083 - 3089
  • [18] Self-Paced Ensemble-SHAP Approach for the Classification and Interpretation of Crash Severity in Work Zone Areas
    Asadi, Roksana
    Khattak, Afaq
    Vashani, Hossein
    Almujibah, Hamad R.
    Rabie, Helia
    Asadi, Seyedamirhossein
    Dimitrijevic, Branislav
    SUSTAINABILITY, 2023, 15 (11)
  • [19] SPE$^{2}$: Self-Paced Ensemble of Ensembles for Software Defect Prediction
    Wan, Xiaohui
    Zheng, Zheng
    Liu, Yang
    IEEE TRANSACTIONS ON RELIABILITY, 2022, 71 (02) : 865 - 879
  • [20] SELF-PACED LEARNING WITH SUPERPIXELWISE FEATURES FOR HYPERSPECTRAL IMAGE CLASSIFICATION
    Tai, Xiaoxiao
    Wang, Guangxing
    Han, Lirong
    Zhang, Xiaoyu
    Ren, Peng
    IGARSS 2020 - 2020 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2020, : 60 - 63