
Festa Shabani
Addressing Gender and Racial Bias in AI: A Data-Centric Approach for Fairer Outcomes.
Supervisors: Antonio Vetro', Luca Gilli, Simona Mazzarino. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2025
PDF (Tesi_di_laurea) - Thesis
Licence: Creative Commons Attribution Non-commercial No Derivatives. Download (4MB)

Archive (ZIP) (Documenti_allegati) - Other
Licence: Creative Commons Attribution Non-commercial No Derivatives. Download (5MB)
Abstract:

This thesis tackles the issue of bias in Artificial Intelligence (AI) systems, focusing specifically on the mitigation of gender and racial biases through a data-centric approach that seeks more equitable results. As AI systems become more prevalent in critical sectors such as healthcare, finance, and criminal justice, they risk unintentionally reinforcing and amplifying the societal biases present in the data they learn from. To address this, the research explores two main techniques for bias mitigation: preprocessing bias correction methods from the AI Fairness 360 (AIF360) toolkit, and synthetic data generation with Clearbox AI's Tabular Engine to augment underrepresented groups. The Adult and Medical Expenditure datasets, which involve sensitive attributes such as sex and race, are used to demonstrate how bias manifests differently in the socio-economic and healthcare domains. Various preprocessing methods, including Reweighing, Disparate Impact Remover, Learning Fair Representations, and Optimized Preprocessing, are applied to mitigate bias, while synthetic data is generated to balance demographic disparities. The effectiveness of these methods is evaluated with fairness metrics such as Statistical Parity Difference, Disparate Impact, Average Odds Difference, Equal Opportunity Difference, and the Theil Index, alongside performance metrics such as Balanced Accuracy. The results highlight the potential of preprocessing bias mitigation techniques, especially synthetic data generation as a form of dataset augmentation, to reduce bias without significantly sacrificing model performance. This work contributes to the growing field of responsible AI by demonstrating how a data-centric approach can make AI models fairer across diverse demographic groups. The findings have practical implications for AI deployment in sensitive applications, providing strategies to improve fairness and accountability in AI-driven decision-making systems. Future work may explore additional bias mitigation techniques, including in-processing and post-processing methods, and further investigate the role of synthetic data generation in reducing bias in real-world AI systems.
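For reference, the fairness metrics named in the abstract have standard definitions; the formulation below follows the AIF360 documentation and is added here only for clarity, not taken from the thesis itself. Here \(\hat{Y}\) is the predicted label, and TPR/FPR are the true- and false-positive rates within the unprivileged and privileged groups:

```latex
\begin{align*}
\text{SPD} &= P(\hat{Y}=1 \mid \text{unprivileged}) - P(\hat{Y}=1 \mid \text{privileged}) \\
\text{DI}  &= \frac{P(\hat{Y}=1 \mid \text{unprivileged})}{P(\hat{Y}=1 \mid \text{privileged})} \\
\text{AOD} &= \tfrac{1}{2}\left[(\mathrm{FPR}_{\text{unpriv}} - \mathrm{FPR}_{\text{priv}})
             + (\mathrm{TPR}_{\text{unpriv}} - \mathrm{TPR}_{\text{priv}})\right] \\
\text{EOD} &= \mathrm{TPR}_{\text{unpriv}} - \mathrm{TPR}_{\text{priv}} \\
\text{Theil} &= \frac{1}{n}\sum_{i=1}^{n} \frac{b_i}{\mu}\ln\frac{b_i}{\mu},
  \qquad b_i = \hat{y}_i - y_i + 1,\quad \mu = \tfrac{1}{n}\sum_i b_i
\end{align*}
```

A perfectly fair classifier under these definitions has SPD, AOD, EOD, and Theil Index near 0 and Disparate Impact near 1.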
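As a concrete illustration of the AIF360 preprocessing workflow the abstract describes, the sketch below loads the Adult dataset, measures Statistical Parity Difference and Disparate Impact with respect to the protected attribute `sex`, applies Reweighing, and measures again. This is a minimal example written for this page, not code from the thesis; it assumes AIF360's default encoding (Male = 1 as the privileged group) and requires the UCI Adult data files that AIF360 expects to be installed locally.

```python
from aif360.datasets import AdultDataset
from aif360.algorithms.preprocessing import Reweighing
from aif360.metrics import BinaryLabelDatasetMetric

# Group definitions using AIF360's default encoding for 'sex'.
privileged = [{'sex': 1}]    # Male
unprivileged = [{'sex': 0}]  # Female

# Load the UCI Adult dataset with AIF360's default preprocessing.
data = AdultDataset()

# Bias in the raw data: SPD is expected below 0, DI below 1.
before = BinaryLabelDatasetMetric(
    data, unprivileged_groups=unprivileged, privileged_groups=privileged)
print('SPD before:', before.statistical_parity_difference())
print('DI  before:', before.disparate_impact())

# Reweighing assigns per-instance weights so that favorable outcomes
# are equally likely across the two groups.
rw = Reweighing(unprivileged_groups=unprivileged,
                privileged_groups=privileged)
data_rw = rw.fit_transform(data)

# After reweighing, SPD should be near 0 and DI near 1.
after = BinaryLabelDatasetMetric(
    data_rw, unprivileged_groups=unprivileged, privileged_groups=privileged)
print('SPD after:', after.statistical_parity_difference())
print('DI  after:', after.disparate_impact())
```

The other preprocessing methods named in the abstract (Disparate Impact Remover, Learning Fair Representations, Optimized Preprocessing) follow the same fit/transform pattern in AIF360, so they can be swapped into this pipeline with minimal changes.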
Supervisors: Antonio Vetro', Luca Gilli, Simona Mazzarino
Academic year: 2024/25
Publication type: Electronic
Number of pages: 79
Degree programme: Master's degree programme in Ingegneria Informatica (Computer Engineering)
Degree class: New system > Master's degree > LM-32 - INGEGNERIA INFORMATICA
Collaborating companies: ClearBox AI Solutions S.R.L.
URI: http://webthesis.biblio.polito.it/id/eprint/35271