Feature Enhanced Ensemble Modeling With Voting Optimization for Credit Risk Assessment

Cited by: 3
Authors
Yang, Dongqi [1]
Xiao, Binqing [1]
Affiliations
[1] Nanjing Univ, Sch Management & Engn, Nanjing 210008, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Natural Science Foundation of China;
Keywords
Risk management; Predictive models; Data models; Adaptation models; Accuracy; Training; Soft sensors; Credit risk; ensemble modeling; feature enhancement; model interpretability; voting optimization; PERFORMANCE; PREDICTION;
DOI
10.1109/ACCESS.2024.3445499
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Machine learning methods are widely used in small and micro enterprise credit risk assessment. However, their practical application faces a tension between accuracy and interpretability. This study proposes a multi-stage ensemble model to enhance model interpretability. To strengthen predictive portraits, a multi-feature enhancement method integrates non-financial behavioral information and soft information on credit rating into the annual loan ledger data, thereby bolstering the explanatory capacity of the features. To address data imbalance while avoiding information loss, a new bagging-based oversampling method oversamples the minority class samples in multiple parallel subsets created by the bagging strategy. To unleash the performance potential of the base classifiers, a new voting-weight optimization method optimizes the soft voting weights of the candidate base classifiers. Experimental results on an annual loan ledger dataset from a commercial bank in China (accuracy of 97.9%, area under the curve of 0.97, logistic loss of 0.07, Brier score of 0.01, and Kolmogorov-Smirnov statistic of 0.38) and on five other public datasets indicate excellent model fit. By focusing on the widespread soft information and data structures characteristic of SME loan risk assessment data, an additional SHAP model explanation method enhances interpretability. This method reveals that the enhanced 'debt-to-income ratio,' along with non-financial behavioral information and features derived from soft information, is essential for predicting loan defaults. Such enhancements help to alleviate information asymmetry in SME loan risk assessment.
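The abstract does not say how the soft voting weights are optimized, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a coarse grid search over weight vectors that sum to 1, picks the vector with the lowest logistic loss, and then reports the Brier score and Kolmogorov-Smirnov statistic of the weighted vote. The function names, the grid-search strategy, and the toy probabilities are all hypothetical.

```python
import math
from itertools import product

def soft_vote(prob_lists, weights):
    """Weighted average of each classifier's predicted default probability."""
    total = sum(weights)
    n = len(prob_lists[0])
    return [sum(w * p[i] for w, p in zip(weights, prob_lists)) / total
            for i in range(n)]

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy (logistic loss); probabilities are clipped."""
    s = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        s -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return s / len(y_true)

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def ks_statistic(y_true, y_prob):
    """Largest gap between the score CDFs of the two classes."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    best = 0.0
    for t in sorted(set(y_prob)):
        cdf_pos = sum(p <= t for p in pos) / len(pos)
        cdf_neg = sum(p <= t for p in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best

def optimize_weights(prob_lists, y_true, steps=10):
    """Assumed optimizer: grid search over weights summing to 1,
    keeping the vector with the lowest logistic loss."""
    k = len(prob_lists)
    best_w, best_loss = None, float("inf")
    for combo in product(range(steps + 1), repeat=k):
        if sum(combo) != steps:
            continue
        w = [c / steps for c in combo]
        loss = log_loss(y_true, soft_vote(prob_lists, w))
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss

# Toy validation data (made up for illustration): 1 = default, 0 = no default.
y = [0, 0, 1, 1, 0, 1]
clf_a = [0.1, 0.2, 0.8, 0.9, 0.3, 0.7]
clf_b = [0.4, 0.6, 0.5, 0.6, 0.5, 0.4]
clf_c = [0.2, 0.1, 0.6, 0.7, 0.4, 0.9]

best_w, best_loss = optimize_weights([clf_a, clf_b, clf_c], y)
voted = soft_vote([clf_a, clf_b, clf_c], best_w)
print("weights:", best_w)
print("log loss:", best_loss)
print("Brier score:", brier_score(y, voted))
print("KS statistic:", ks_statistic(y, voted))
```

Because the grid contains the vectors that put all weight on a single classifier, the optimized vote can never have a higher logistic loss on this data than the best individual classifier. In practice the weights would be tuned on held-out data and with a less brute-force search, since the grid grows quickly with the number of base classifiers.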
Pages: 115124-115136
Page count: 13