Paolo Alberto
Privacy Preserving Data Mining: a distributed approach to data anonymization.
Supervisor: Paolo Garza. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2021
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
As Data Science algorithms find a growing number of real-world applications, data privacy and the protection of sensitive information have become widely debated topics. This is especially true in light of the direction taken by European legislation on the data protection of EU citizens. While some software solutions for data anonymization are already available on the market, none of them is well suited to Big Data applications. In this project we propose a distributed computing approach to data anonymization, leveraging the Apache Spark engine to run privacy-preserving algorithms inside a large-scale data processing environment.
We also explore the topic of data classification, with the goal of predicting the appropriate level of privacy when new data is uploaded to the system.
