Politecnico di Torino

Machine Learning approach for credit score analysis

Margherita Doria

Machine Learning approach for credit score analysis.

Supervisor: Patrizia Semeraro. Politecnico di Torino, Master's degree programme in Mathematical Engineering (Corso di laurea magistrale in Ingegneria Matematica), 2021

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

One of a bank's core functions is credit risk management, and one of its most important tools is credit score analysis. The purpose of credit scoring is to improve the assessment of a client's creditworthiness during the credit evaluation process. The primary objective is to discriminate among borrowers by their likelihood of default: to identify which customers have a high likelihood of default, and thus could become insolvent, and which have a lower likelihood and are more likely to meet their financial obligations. The most commonly used credit scoring method is logit regression. In this study, we apply Machine Learning models to the prediction of private residential mortgage defaults. The study employs several classification methods: Logistic Regression, K-Nearest Neighbors, Decision Trees, AdaBoost, XGBoost, Random Forest and Support Vector Machine. To further improve predictive power, a Deep Learning technique, the Convolutional Neural Network, which is widely used in image processing, is applied to consumer credit scoring to see whether it also performs well in this setting. Two data samples were used: a public one and a private one. The private sample was provided by a Swiss bank, and an oversampling technique called SMOTE was used to treat the class imbalance in the response variable. The aim of this work is to determine which of these methods performs best in default prediction with regard to the chosen model evaluation metrics. On the private data, the Deep Learning approach achieved a significant improvement in predictive performance. On the public data, by contrast, the model with the best predictive ability was Adaptive Boosting (AdaBoost).
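
The abstract mentions SMOTE as the oversampling step but this record does not show how the thesis implemented it. As a rough illustration only, the core idea (creating synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours) might be sketched as follows. This is a simplified variant that picks the base sample at random; the function name `smote`, its parameters, and the toy data are all hypothetical and are not taken from the thesis.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Simplified sketch: assumes at least two minority samples and
    purely numeric features.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random position on the segment from base to neighbour
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two numeric features per defaulted loan.
defaults = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote(defaults, n_new=6, k=2)
```

In practice the synthetic points would be added to the training split only, so that evaluation still reflects the original class balance; whether the thesis proceeded this way is not stated in this record.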

Supervisors: Patrizia Semeraro
Academic year: 2021/22
Publication type: Electronic
Number of Pages: 104
Degree programme: Corso di laurea magistrale in Ingegneria Matematica (Master's degree in Mathematical Engineering)
Degree class: New organization > Master science > LM-44 - MATHEMATICAL MODELLING FOR ENGINEERING
Collaborating companies: Credit Suisse AG
URI: http://webthesis.biblio.polito.it/id/eprint/19859