Two-stage consumer credit risk modelling using heterogeneous ensemble learning
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216275%3A25410%2F19%3A39914868" target="_blank" >RIV/00216275:25410/19:39914868 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0167923619300028" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0167923619300028</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.dss.2019.01.002" target="_blank" >10.1016/j.dss.2019.01.002</a>
Alternative languages
Result language
English
Title in the original language
Two-stage consumer credit risk modelling using heterogeneous ensemble learning
Result description in the original language
Modelling consumer credit risk is a crucial task for banks and non-bank financial institutions to support decision-making on granting loans. To model the overall credit risk of a consumer loan in terms of expected loss (EL), three key credit risk parameters must be estimated: probability of default (PD), loss given default (LGD) and exposure at default (EAD). Research to date has tended to model these parameters separately. Moreover, a neglected area in the field of LGD/EAD modelling is the application of ensemble learning, which by benefitting from diverse base learners reduces the over-fitting problem and enables modelling diverse risk profiles of defaulted loans. To overcome these problems, this paper proposes a two-stage credit risk model that integrates (1) class-imbalanced ensemble learning for predicting PD (credit scoring), and (2) an EAD prediction using a regression ensemble. Furthermore, multi-objective evolutionary feature selection is used to minimize both the misclassification cost (root mean squared error) of the PD and EAD models and the number of attributes necessary for modelling. For this task, we propose a misclassification cost metric suitable for consumer loans with fixed exposure because it combines opportunity cost and LGD. We show that the proposed credit risk model is not only more effective than single-stage credit risk models but also outperforms state-of-the-art methods used to model credit risk in terms of prediction and economic performance.
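The abstract names three parameters that together determine expected loss. As a minimal sketch, assuming the standard decomposition EL = PD × LGD × EAD (the record itself does not spell out the formula), the calculation for a single loan could look as follows. The `Loan` class and the `expected_loss` function are hypothetical names introduced for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Loan:
    """Hypothetical container for the three credit risk parameters."""
    pd: float   # probability of default, in [0, 1]
    lgd: float  # loss given default as a fraction of exposure, in [0, 1]
    ead: float  # exposure at default, in currency units


def expected_loss(loan: Loan) -> float:
    """Return EL = PD * LGD * EAD (assumed standard decomposition)."""
    if not 0.0 <= loan.pd <= 1.0:
        raise ValueError("PD must lie in [0, 1]")
    if not 0.0 <= loan.lgd <= 1.0:
        raise ValueError("LGD must lie in [0, 1]")
    if loan.ead < 0.0:
        raise ValueError("EAD must be non-negative")
    return loan.pd * loan.lgd * loan.ead


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    loan = Loan(pd=0.5, lgd=0.5, ead=1000.0)
    print(expected_loss(loan))
```

In a two-stage setting such as the one described, `pd` would come from the first-stage classifier and `ead` from the second-stage regression ensemble; here both are supplied directly to keep the sketch self-contained.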
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
<a href="/cs/project/GA16-19590S" target="_blank" >GA16-19590S: Topic and sentiment analysis of multiple textual sources for corporate financial decision-making</a>
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2019
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
Decision Support Systems
ISSN
0167-9236
e-ISSN
—
Periodical volume
118
Issue number within the volume
March
Publisher's country
NL - Netherlands
Number of pages
13
Pages from-to
33-45
UT WoS code of the article
000461535200004
EID of the result in the Scopus database
2-s2.0-85059804281