Subagging for credit scoring models

Cited by: 135
Authors
Paleologo, Giuseppe [2]
Elisseeff, Andre [1]
Antonini, Gianluca [1]
Affiliations
[1] IBM Res GmbH, Zurich Res Lab, CH-8803 Ruschlikon, Switzerland
[2] IBM Global Financing Serv, Armonk, NY USA
Keywords
Risk analysis; Credit scoring; Classification; Decision support systems
DOI
10.1016/j.ejor.2009.03.008
Chinese Library Classification number
C93 [Management science]
Discipline classification code
12; 1201; 1202; 120202
Abstract
Logistic regression has long been the most widely used statistical method for assessing customer credit risk. Recently, a more pragmatic approach has been adopted, in which the primary concern is predicting credit risk rather than explaining it. In this context, several classification techniques, support vector machines among them, have been shown to perform well on credit scoring. While the search for better classifiers is an important research topic, the methodology chosen for real-world applications has to cope with the challenges posed by data collected in industry. Such data are often highly unbalanced, part of the information can be missing, and some common assumptions, such as the i.i.d. assumption, can be violated. In this paper we present a case study based on a sample of IBM Italian customers that exhibits all of the challenges mentioned above. The main objective is to build and validate robust models able to handle missing information, class imbalance, and non-i.i.d. data points. We define a missing-data imputation method and propose the use of an ensemble classification technique, subagging, which is particularly suitable for highly unbalanced data such as credit scoring data. Both the imputation and subagging steps are embedded in a customized cross-validation loop that handles dependencies between different credit requests. The methodology has been applied using several classifiers (kernel support vector machines, nearest neighbors, decision trees, AdaBoost) and their subagged versions. Subagging improves the performance of the base classifier, and we show that subagged decision trees achieve better performance while keeping the model simple and reasonably interpretable. (C) 2009 Elsevier B.V. All rights reserved.
Pages: 490-499
Number of pages: 10
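
The abstract names two technical ingredients: subagging (aggregating classifiers fitted on subsamples drawn without replacement) and a cross-validation loop that respects dependencies between credit requests. The sketch below illustrates both ideas in plain Python and is not the authors' implementation. It rests on several assumptions that do not come from the record: each subsample keeps every minority-class example and draws an equal number of majority-class examples, the base learner is a one-level decision tree (stump), votes are combined by simple majority, and dependent requests are identified by a shared customer id. All function names and the toy data are hypothetical.

```python
import random

def fit_stump(X, y):
    """Fit a one-feature threshold rule by exhaustive search (illustrative base learner)."""
    best = None  # (training errors, feature index, threshold, label when value > threshold)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for hi in (0, 1):
                errors = sum((hi if row[j] > t else 1 - hi) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, hi)
    _, j, t, hi = best
    return lambda row: hi if row[j] > t else 1 - hi

def subagging_fit(X, y, n_models=25, seed=0):
    """Fit one stump per balanced subsample; label 1 is assumed to be the minority class."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    models = []
    for _ in range(n_models):
        # Sampling WITHOUT replacement is what separates subagging from bagging.
        idx = minority + rng.sample(majority, len(minority))
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def subagging_predict(models, row):
    """Aggregate the individual predictions by majority vote."""
    votes = sum(model(row) for model in models)
    return 1 if 2 * votes > len(models) else 0

def grouped_folds(groups, k):
    """Put all requests of one customer in the same fold (assumed dependency structure)."""
    fold_of = {g: i % k for i, g in enumerate(sorted(set(groups)))}
    return [fold_of[g] for g in groups]

# Made-up, perfectly separable toy data: 200 "good" and 20 "bad" requests.
X = [[i * 0.01, i % 7] for i in range(200)] + [[3 + 0.05 * i, i % 7] for i in range(20)]
y = [0] * 200 + [1] * 20

models = subagging_fit(X, y, n_models=25, seed=0)
print(subagging_predict(models, [5.0, 3]))   # high first feature: predicted class 1
print(subagging_predict(models, [-1.0, 3]))  # low first feature: predicted class 0
print(grouped_folds(["a", "a", "b", "c", "b"], 2))
```

Balancing each subsample means every base learner sees the rare class at full strength, while aggregation over many subsamples recovers the information discarded from the majority class. In a full pipeline, imputation and subagging would be refitted inside each training fold so that no information leaks from the held-out customers.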