
Credit Risk Assessment Using Machine Learning Techniques

Martina Scagliola

Credit Risk Assessment Using Machine Learning Techniques.

Supervisor: Patrizia Semeraro. Politecnico di Torino, Master's degree programme in Mathematical Engineering (Ingegneria Matematica), 2022

PDF (Tesi_di_laurea) - Thesis (4MB)
License: Creative Commons Attribution Non-commercial No Derivatives.

Archive (ZIP) (Documenti_allegati) - Other (1MB)
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Credit is essential to financial systems. Financial institutions, whose role is to allocate credit, need to fully understand the risk behind it and to decide correctly whom to grant credit and whom to refuse. To do so, they rely on credit scoring, one of the most successful applications of statistical and operational research modelling in finance. The aim of this thesis is to combine supervised and unsupervised machine learning models to predict the probability of default of a set of individuals who applied to a bank for a loan, and to classify them correctly according to their individual propensity to default. In Chapter 1 we introduce the concepts of credit risk and credit scoring, and then formally describe two mixture models: the Bernoulli mixture model and the Poisson mixture model. Chapter 2 illustrates the fundamental concepts behind machine learning and contains the theoretical description of some supervised learning models (logistic regression, support vector machine, K-nearest neighbors, random forest and AdaBoost classifier) and some clustering methods. In Chapter 3 and Chapter 4 we perform a credit scoring analysis on a public data set that simulates the real data set of a bank. In particular, we test the validity of integrating unsupervised and supervised machine learning techniques by comparing the performance of individual models with that of cluster-based models. Chapter 3 contains the preprocessing part of the analysis, while Chapter 4 focuses on the application of the machine learning models and the evaluation of their performance according to a set of measures such as AUC, accuracy, F-score, and type I and type II errors. It also introduces the concept of expected misclassification cost, which gives an idea of the economic impact of the models' results.
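
As an illustration of the evaluation measures named in the abstract, the following is a minimal sketch of how type I and type II error rates and an expected misclassification cost might be computed from predicted labels. It is not taken from the thesis. The function names, the label convention (1 = default, treated as the positive class, so a type I error is a good borrower flagged as a defaulter and a type II error is a defaulter classified as good), the cost values, and the cost formula (each error rate weighted by its class prior and a per-error cost) are all assumptions made for this example; the thesis may define these quantities differently.

```python
def error_rates(y_true, y_pred):
    """Return (type_i, type_ii) error rates.

    Assumed convention (not from the thesis): label 1 = default (positive class).
    type_i  = share of good borrowers (0) predicted as defaulters (1)
    type_ii = share of defaulters (1) predicted as good borrowers (0)
    """
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n_good = sum(1 for t in y_true if t == 0)
    n_bad = sum(1 for t in y_true if t == 1)
    type_i = false_pos / n_good if n_good else 0.0
    type_ii = false_neg / n_bad if n_bad else 0.0
    return type_i, type_ii


def expected_misclassification_cost(y_true, y_pred, cost_fp=1.0, cost_fn=5.0):
    """Weight each error rate by its class prior and a per-error cost.

    cost_fp and cost_fn are hypothetical values chosen for illustration:
    accepting a defaulter is assumed to cost five times as much as
    rejecting a good borrower.
    """
    type_i, type_ii = error_rates(y_true, y_pred)
    prior_bad = sum(y_true) / len(y_true)   # observed share of defaulters
    prior_good = 1.0 - prior_bad
    return cost_fp * prior_good * type_i + cost_fn * prior_bad * type_ii


# Toy data: 6 good borrowers and 4 defaulters (made up for the example).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

type_i, type_ii = error_rates(y_true, y_pred)
emc = expected_misclassification_cost(y_true, y_pred)
print(f"type I: {type_i:.3f}, type II: {type_ii:.3f}, EMC: {emc:.3f}")
```

On this toy data there are two false positives and one false negative among ten applicants, so under the assumed costs the expected misclassification cost is (2 × 1 + 1 × 5) / 10 = 0.7 per applicant. Two classifiers with the same accuracy can differ on this figure, which is why a cost-weighted measure is useful alongside AUC or F-score.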

Supervisors: Patrizia Semeraro
Academic year: 2022/23
Publication type: Electronic
Number of pages: 86
Subjects:
Degree programme: Master's degree programme in Mathematical Engineering (Ingegneria Matematica)
Degree class: New system > Master's degree > LM-44 - Mathematical and Physical Modelling for Engineering
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/24058