A Pre-Processing Framework for Mitigating Representation Bias in Machine Learning Classification Algorithms

Annalisa Deiana

A Pre-Processing Framework for Mitigating Representation Bias in Machine Learning Classification Algorithms.

Supervisor: Francesco Della Santa. Politecnico di Torino, Master's degree program in Mathematical Engineering (Corso di laurea magistrale in Ingegneria Matematica), 2024

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

The use of Machine Learning (ML) algorithms in decision-making processes has increased significantly in recent years, providing alternatives to human decisions, which are frequently affected by bias. However, ML algorithms can also exhibit bias, leading to discrimination against individuals or groups based on sensitive attributes such as gender or race. This bias often arises from the imbalanced representation of demographic groups in training datasets. Mitigating representation bias during the training phase is crucial to ensure the fair application of ML algorithms in decision-making processes. This thesis presents a pre-processing framework that addresses representation bias by oversampling minority groups, thereby creating a balanced and fair dataset for model training. The proposed framework identifies skewed groups with lower imbalance ratios and employs the DBSCAN clustering algorithm to classify points as core, border, or noise. The SMOTE oversampling algorithm then generates synthetic samples by interpolating between border points and border or core points until each group attains the highest imbalance ratio. The performance and fairness of the proposed method are evaluated using standard evaluation metrics and fairness measures. Experimental results indicate that the framework significantly improves fairness while incurring only a minimal loss in predictive performance compared to other existing methods.
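
The two steps named in the abstract (labelling points as core, border, or noise, then interpolating from border points) can be illustrated with a minimal, dependency-free Python sketch. This is not the thesis's implementation: the function names `classify_points` and `oversample`, the parameters `eps`, `min_samples`, `k`, and `n_new`, and the toy data are all assumptions made for illustration. In particular, the sketch generates a fixed number of samples rather than oversampling until a target imbalance ratio is reached, and it only labels points without forming clusters. In practice one would more likely rely on library implementations such as scikit-learn's `DBSCAN` and imbalanced-learn's `SMOTE`.

```python
import math
import random


def classify_points(points, eps, min_samples):
    """Label each point 'core', 'border', or 'noise' using DBSCAN's definitions."""
    # Each neighborhood includes the point itself (the usual DBSCAN convention).
    neighbors = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= eps]
        for p in points
    ]
    core = {i for i, nb in enumerate(neighbors) if len(nb) >= min_samples}
    labels = []
    for i, nb in enumerate(neighbors):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            labels.append("border")  # not dense itself, but within eps of a core point
        else:
            labels.append("noise")
    return labels


def oversample(points, labels, n_new, k=3, seed=0):
    """Create n_new synthetic points by SMOTE-style linear interpolation.

    Each synthetic point lies on the segment between a randomly chosen border
    point and one of its k nearest border/core neighbors. Noise points are
    never used as endpoints.
    """
    rng = random.Random(seed)
    border = [i for i, lab in enumerate(labels) if lab == "border"]
    usable = [i for i, lab in enumerate(labels) if lab != "noise"]
    if not border or len(usable) < 2:
        return []  # nothing to interpolate between
    synthetic = []
    for _ in range(n_new):
        b = rng.choice(border)
        nearest = sorted(
            (j for j in usable if j != b),
            key=lambda j: math.dist(points[b], points[j]),
        )[:k]
        n = rng.choice(nearest)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(x + gap * (y - x) for x, y in zip(points[b], points[n]))
        )
    return synthetic


# Toy minority group: a dense square, one point on its edge, and one outlier.
minority = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (8, 8)]
labels = classify_points(minority, eps=1.1, min_samples=3)
new_points = oversample(minority, labels, n_new=5)
```

On this toy data the four corners of the square come out as core points, `(2, 1)` as a border point, and `(8, 8)` as noise, so every synthetic point falls between `(2, 1)` and the square and the outlier never influences the new samples.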

Supervisors: Francesco Della Santa
Academic year: 2023/24
Publication type: Electronic
Number of pages: 72
Subjects:
Degree program: Master's degree program in Mathematical Engineering (Corso di laurea magistrale in Ingegneria Matematica)
Degree class: New system > Master's degree > LM-44 - MATHEMATICAL AND PHYSICAL MODELLING FOR ENGINEERING
Collaborating organizations: UTRECHT UNIVERSITY
URI: http://webthesis.biblio.polito.it/id/eprint/31589