Bad Teaching in Machine Unlearning with Similarity-based Sampling

Claudio Savelli

Supervisors: Flavio Giobergia, Elena Maria Baralis. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2024

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

This thesis explores machine unlearning, focusing on the development and evaluation of unlearning algorithms. These methods enable machine learning models to selectively erase the influence of specific training data, in compliance with privacy regulations such as the GDPR's "Right to be Forgotten." The research introduces a novel unlearning framework designed to optimize unlearning effectiveness without compromising the model’s performance. A significant contribution of this work is a new unlearning method that surpasses existing algorithms in the balance between data forgetting and model utility. The method also allows a dynamic evaluation of the trade-off between forgetting and retention, ensuring that the model is retrained as effectively as possible under potential constraints. Three new datasets (MUCelebA, Modified MUFAC, and MUCIFAR-100) are developed to rigorously test the proposed unlearning technique and benchmark it against the others available in the literature. These datasets are built to address different aspects of unlearning, providing a diverse testing environment that mirrors real-world applications. The thesis presents a comprehensive evaluation on these datasets, showcasing the versatility and robustness of the proposed methods under various scenarios. The findings offer a comprehensive overview of machine unlearning and a solid framework for continuing research in this field, and they make it more feasible to comply with privacy regulations such as the GDPR without prohibitive computational costs. The methodology and datasets introduced set a new benchmark for future research in this field.

Supervisors: Flavio Giobergia, Elena Maria Baralis
Academic year: 2023/24
Publication type: Electronic
Number of pages: 50
Subjects:
Degree programme: Master's degree programme in Data Science and Engineering
Degree class: New degree system > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/31851