Giacomo Fantino
Research and development of methods for the generation of synthetic data for the protection of privacy and the reduction of the risk of bias.
Supervisors: Antonio Vetro', Marco Rondina. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2024
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
Synthetic data generation has become an increasingly important technique in data-driven applications. By generating data based on the properties of real data, this approach offers several advantages, including data augmentation, anonymization, semi-supervised learning, and the rebalancing of imbalanced learning contexts. Generating tabular data brings additional challenges, such as very complex relationships between attributes and the presence of categorical values that carry semantic meaning. As a result, a great deal of research has been produced in recent years, yielding methodologies that apply to many contexts and show very interesting results. In this thesis, I analyzed the state of the art, experimented with several generators, and used them in three applications: oversampling, privacy, and fairness. In each application I performed a set of experiments with the objective of evaluating the effectiveness, impact, and trade-offs of the synthetic data generators. |
---|---|
Supervisors: | Antonio Vetro', Marco Rondina |
Academic year: | 2023/24 |
Publication type: | Electronic |
Number of pages: | 65 |
Subjects: | |
Degree programme: | Master's degree programme in Computer Engineering (Ingegneria Informatica) |
Degree class: | New system > Master's degree > LM-32 - INGEGNERIA INFORMATICA |
Collaborating companies: | DATA Reply S.r.l. con Unico Socio |
URI: | http://webthesis.biblio.polito.it/id/eprint/31817 |