Privacy Preserving Data Mining: a distributed approach to data anonymization

Paolo Alberto

Privacy Preserving Data Mining: a distributed approach to data anonymization.

Supervisor: Paolo Garza. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2021

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

With the growing number of real-world applications of Data Science algorithms, data privacy and the protection of sensitive information have become increasingly debated topics. This is especially true in light of the direction taken by European legislation on the data protection of EU citizens. While some software solutions for data anonymization are already available on the market, none of them is well suited to Big Data applications. In this project we propose a distributed computing approach to data anonymization, leveraging the Apache Spark engine to run privacy-preserving algorithms inside a large-scale data processing environment. We also explore the topic of data classification, with the goal of predicting the appropriate level of privacy when new data is uploaded to the system. The final product will be a software library capable of querying multiple data sources and applying the required algorithms to the result. These computations will be performed with two main goals in mind: protecting the sensitive data of individuals while preserving as much information as possible for analysts and data scientists to work with.

Supervisors: Paolo Garza
Academic year: 2021/22
Publication type: Electronic
Number of pages: 79
Subjects:
Degree programme: Master's degree programme in Data Science And Engineering
Degree class: New organization > Master of Science > LM-32 - COMPUTER ENGINEERING
Collaborating companies: Agile Lab S.r.l.
URI: http://webthesis.biblio.polito.it/id/eprint/21215