Sullivan Hué, Christophe Hurlin, Christophe Pérignon and Sébastien Saurin
Additional contact information
Sullivan Hué: Aix-Marseille University - Aix-Marseille School of Economics
Christophe Hurlin: University of Orleans
Christophe Pérignon: HEC Paris
Sébastien Saurin: University of Orleans, Laboratoire d'économie d'Orléans, Students
Abstract: In credit scoring, machine learning models are known to outperform standard parametric models. Because these models condition access to credit, banking supervisors and internal model validation teams need to monitor their predictive performance and to identify the features with the highest impact on performance. To facilitate this, we introduce the XPER methodology to decompose a performance metric (e.g., AUC, R^2) into specific contributions associated with the various features of a classification or regression model. XPER is theoretically grounded in Shapley values and is both model-agnostic and performance metric-agnostic. Furthermore, it can be implemented either at the model level or at the individual level. Using a novel dataset of car loans, we decompose the AUC of a machine-learning model trained to forecast the default probability of loan applicants. We show that a small number of features can explain a surprisingly large part of the model performance. Furthermore, we find that the features that contribute the most to the predictive performance of the model may not be the ones that contribute the most to individual forecasts (SHAP). We also show how XPER can be used to deal with heterogeneity issues and significantly boost out-of-sample performance.
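The idea of splitting a performance metric into per-feature Shapley contributions can be illustrated with a minimal sketch. It is not the authors' XPER implementation: the toy data, the fixed linear scoring rule, and the helper names (auc, shapley_performance, toy_model) are all hypothetical, and features outside a coalition are simply replaced by their sample mean, which is a simplifying assumption made here for brevity. The sketch only shows the general Shapley mechanics: a benchmark value for the empty coalition plus one contribution per feature, summing to the metric of the full model.

```python
from itertools import combinations
from math import factorial


def auc(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shapley_performance(model, X, y, metric):
    """Exact Shapley decomposition of metric(model(X), y) across features.

    Assumption (not taken from the paper): a feature outside the coalition
    is neutralised by replacing it with its sample mean.
    """
    n_features = len(X[0])
    means = [sum(row[j] for row in X) / len(X) for j in range(n_features)]

    def value(coalition):
        masked = [
            [row[j] if j in coalition else means[j] for j in range(n_features)]
            for row in X
        ]
        return metric([model(row) for row in masked], y)

    benchmark = value(frozenset())  # metric when no feature carries information
    contributions = []
    for j in range(n_features):
        others = [k for k in range(n_features) if k != j]
        phi = 0.0
        for size in range(len(others) + 1):
            # Standard Shapley weight |S|! (p - |S| - 1)! / p!
            weight = (
                factorial(size) * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi += weight * (value(s | {j}) - value(s))
        contributions.append(phi)
    return benchmark, contributions


# Hypothetical toy data: three features, binary default indicator.
X = [
    [0.9, 0.2, 0.5],
    [0.8, 0.7, 0.1],
    [0.4, 0.9, 0.3],
    [0.3, 0.1, 0.9],
    [0.2, 0.6, 0.4],
    [0.1, 0.3, 0.8],
]
y = [1, 1, 0, 1, 0, 0]


def toy_model(row):
    """Hypothetical fixed scoring rule; the third feature is ignored."""
    return 2.0 * row[0] + 0.5 * row[1] + 0.0 * row[2]


full_auc = auc([toy_model(row) for row in X], y)
benchmark, contributions = shapley_performance(toy_model, X, y, auc)

print("AUC of the full model:", full_auc)
print("Benchmark (empty coalition):", benchmark)
for j, phi in enumerate(contributions):
    print(f"Contribution of feature {j}:", phi)
```

In this sketch the benchmark plus the contributions add up to the AUC of the full model, and a feature the scoring rule ignores receives a zero contribution. Exact enumeration visits every coalition, so it is only practical for a handful of features; larger models would need a sampling approximation.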
Keywords: Machine learning; Explainability; Performance metric; Shapley value
73 pages, November 22, 2022
Full text files
papers.cfm?abstract_id=4280563 (HTML file, full text)
RePEc:ebg:heccah:1463