Multiple optimized ensemble learning for high-dimensional imbalanced credit scoring datasets

Cited by: 1
Authors
Lenka, Sudhansu R. [1 ,2 ]
Bisoy, Sukant Kishoro [1 ]
Priyadarshini, Rojalina [1 ]
Affiliations
[1] CV Raman Global Univ, Dept CSE, Bhubaneswar, India
[2] Trident Acad Technol, Bhubaneswar, India
Keywords
Class imbalanced data; Optimization subset; Feature selection; Ensemble learning; Credit scoring; Resampling; DECISION TREE; RISK; SMOTE; PERFORMANCE; ALGORITHM;
DOI
10.1007/s10115-024-02129-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Credit scoring models are crucial tools for lenders to assess credit risk, and researchers in academia and the financial industry have shown intense interest in them. However, real credit datasets are often high-dimensional and class-imbalanced, which makes it challenging to develop accurate and effective credit scoring models. To address these challenges, a new approach called Multiple-Optimized Ensemble Learning (MOEL) is proposed. In MOEL, a technique called Multiple Diverse Optimized Subsets (MDOS) generates multiple diverse optimized subsets from various weighted random forests. From each subset, the more effective and relevant features are selected. A new evaluation measure is then applied to each subset to identify the more optimized subsets. These subsets are passed to a novel Mahalanobis-based oversampling (MOS) technique, which provides balanced subsets for the base classifier and lessens the detrimental effects of class imbalance. Finally, a stacking-based ensemble method is applied to the balanced subsets to integrate the base models. The proposed model was evaluated on six high-dimensional imbalanced credit scoring datasets and outperformed state-of-the-art methods, achieving mean ranks of 1.5 and 1.333 in terms of F1_score and G-mean, respectively.
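The abstract reports results in terms of F1_score and G-mean. As a minimal sketch, the two measures can be computed from confusion-matrix counts using their standard definitions. This is not necessarily the evaluation code used in the paper; the function name `f1_and_gmean` and all counts below are hypothetical and chosen only for illustration.

```python
import math


def f1_and_gmean(tp, fp, fn, tn):
    """Return (F1, G-mean) for the positive (minority) class.

    Standard definitions, assumed here rather than taken from the paper:
      F1     = harmonic mean of precision and recall
      G-mean = sqrt(sensitivity * specificity)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    gmean = math.sqrt(recall * specificity)
    return f1, gmean


# Hypothetical imbalanced test set: 50 defaulters, 950 non-defaulters.
f1, gmean = f1_and_gmean(tp=40, fp=10, fn=10, tn=940)

# A classifier that always predicts the majority class reaches 95% accuracy
# on the same data, yet both measures collapse to zero.
f1_trivial, gmean_trivial = f1_and_gmean(tp=0, fp=0, fn=50, tn=950)
```

Both measures depend on minority-class recall, so a model that ignores the minority class scores poorly on them even when its plain accuracy is high. This is the usual reason for preferring them on imbalanced data.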
Pages: 5429 - 5457
Page count: 29
Related papers
50 in total
  • [21] A novel deep ensemble model for imbalanced credit scoring in internet finance
    Xiao, Jin
    Zhong, Yu
    Jia, Yanlin
    Wang, Yadong
    Li, Ruoyi
    Jiang, Xiaoyi
    Wang, Shouyang
    INTERNATIONAL JOURNAL OF FORECASTING, 2024, 40 (01) : 348 - 372
  • [22] An oversampling algorithm for high-dimensional imbalanced learning with class overlapping
    Yang, Xu
    Xue, Zhen
    Zhang, Liangliang
    Wu, Jianzhen
    KNOWLEDGE AND INFORMATION SYSTEMS, 2025, 67 (02) : 1915 - 1943
  • [23] Adaptive Subspace Optimization Ensemble Method for High-Dimensional Imbalanced Data Classification
    Xu, Yuhong
    Yu, Zhiwen
    Chen, C. L. Philip
    Liu, Zhulin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (05) : 2284 - 2297
  • [24] Classifier Ensemble Based on Multiview Optimization for High-Dimensional Imbalanced Data Classification
    Xu, Yuhong
    Yu, Zhiwen
    Chen, C. L. Philip
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (01) : 870 - 883
  • [25] A-RDBOTE: an improved oversampling technique for imbalanced credit-scoring datasets
    Sudhansu R. Lenka
    Sukant Kishoro Bisoy
    Rojalina Priyadarshini
    Risk Management, 2023, 25
  • [26] A-RDBOTE: an improved oversampling technique for imbalanced credit-scoring datasets
    Lenka, Sudhansu R.
    Bisoy, Sukant Kishoro
    Priyadarshini, Rojalina
    RISK MANAGEMENT-AN INTERNATIONAL JOURNAL, 2023, 25 (04):
  • [27] Credit scoring using ensemble machine learning
    Yao, Ping
    HIS 2009: 2009 NINTH INTERNATIONAL CONFERENCE ON HYBRID INTELLIGENT SYSTEMS, VOL 3, PROCEEDINGS, 2009, : 244 - 246
  • [28] A comparative assessment of ensemble learning for credit scoring
    Wang, Gang
    Hao, Jinxing
    Ma, Jian
    Jiang, Hongbing
    EXPERT SYSTEMS WITH APPLICATIONS, 2011, 38 (01) : 223 - 230
  • [29] Research on classification method of high-dimensional class-imbalanced datasets based on SVM
    Chunkai Zhang
    Ying Zhou
    Jianwei Guo
    Guoquan Wang
    Xuan Wang
    International Journal of Machine Learning and Cybernetics, 2019, 10 : 1765 - 1778
  • [30] Research on classification method of high-dimensional class-imbalanced datasets based on SVM
    Zhang, Chunkai
    Zhou, Ying
    Guo, Jianwei
    Wang, Guoquan
    Wang, Xuan
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2019, 10 (07) : 1765 - 1778