Feature Enhanced Ensemble Modeling With Voting Optimization for Credit Risk Assessment

Cited by: 3
Authors
Yang, Dongqi [1]
Xiao, Binqing [1]
Affiliations
[1] Nanjing Univ, Sch Management & Engn, Nanjing 210008, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Natural Science Foundation of China
Keywords
Risk management; Predictive models; Data models; Adaptation models; Accuracy; Training; Soft sensors; Credit risk; ensemble modeling; feature enhancement; model interpretability; voting optimization; PERFORMANCE; PREDICTION;
DOI
10.1109/ACCESS.2024.3445499
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Machine learning methods are widely used in small and micro enterprise (SME) credit risk assessment. However, their practical application faces a trade-off between accuracy and interpretability. In this study, a multi-stage ensemble model is proposed to enhance model interpretability. To strengthen predictive profiles, a multi-feature enhancement method is proposed that integrates non-financial behavioral information and soft information on credit rating into the annual loan ledger data, thereby bolstering the explanatory capacity of the features. To address data imbalance while avoiding information loss, a new bagging-based oversampling method is proposed that oversamples the minority-class samples in multiple parallelized subsets produced by the bagging strategy. To unleash the performance potential of the base classifiers, a new voting-weight optimization method is proposed that optimizes the soft voting weights of the candidate base classifiers. Experimental results on an annual loan ledger dataset from a commercial bank in China (an accuracy of 97.9%, an area under the curve of 0.97, a logistic loss of 0.07, a Brier score of 0.01, and a Kolmogorov-Smirnov statistic of 0.38) and on five other public datasets indicate excellent model fit. By focusing on the widespread soft information and data structures characteristic of SME loan risk assessment data, an additional SHAP model explanation method enhances interpretability. This method reveals that the enhanced 'debt-to-income ratio', non-financial behavioral information, and features derived from soft information are essential for predicting loan defaults. Such enhancements help to alleviate the issue of information asymmetry in SME loan risk assessment.
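The abstract does not say how the soft voting weights are optimized, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a plain grid search over the weight simplex that minimizes logistic loss on held-out data; the function names (`soft_vote`, `log_loss`, `grid_search_weights`) and the toy probabilities are hypothetical and chosen purely for illustration.

```python
import itertools
import math


def soft_vote(probas, weights):
    """Weighted average of per-classifier default probabilities, per sample."""
    total = sum(weights)
    return [
        sum(w * p[i] for w, p in zip(weights, probas)) / total
        for i in range(len(probas[0]))
    ]


def log_loss(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy (logistic loss), with clipping for stability."""
    loss = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return loss / len(y_true)


def grid_search_weights(probas, y_true, steps=12):
    """Search non-negative weights summing to 1 on a regular grid.

    Assumption: exhaustive grid search stands in for whatever optimizer
    the paper actually uses, which the abstract does not specify.
    """
    best_weights, best_loss = None, float("inf")
    for combo in itertools.product(range(steps + 1), repeat=len(probas)):
        if sum(combo) != steps:
            continue  # keep only points on the weight simplex
        weights = [c / steps for c in combo]
        loss = log_loss(y_true, soft_vote(probas, weights))
        if loss < best_loss:
            best_weights, best_loss = weights, loss
    return best_weights, best_loss


# Toy validation labels and predicted default probabilities (made-up values).
y_val = [0, 0, 1, 1, 0, 1]
probas_val = [
    [0.1, 0.2, 0.9, 0.8, 0.1, 0.7],  # hypothetical base classifier A
    [0.4, 0.6, 0.5, 0.6, 0.5, 0.4],  # hypothetical base classifier B (weak)
    [0.2, 0.1, 0.7, 0.9, 0.3, 0.8],  # hypothetical base classifier C
]

weights, loss = grid_search_weights(probas_val, y_val)
equal_loss = log_loss(y_val, soft_vote(probas_val, [1, 1, 1]))
print("optimized weights:", weights)
print("optimized log loss:", loss, "| equal-weight log loss:", equal_loss)
```

Because the grid with `steps=12` contains the equal-weight point, the optimized loss on the validation data can never be worse than plain averaging. In practice the weights would be tuned on data separate from the final test set to avoid an optimistic estimate.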
Pages: 115124 - 115136
Number of pages: 13