On the Impact of Adversarial Training on Uncertainty Estimation and Uncertainty Targeted Attacks

Gilberto Manunza

On the Impact of Adversarial Training on Uncertainty Estimation and Uncertainty Targeted Attacks.

Supervisors: Barbara Caputo, Martin Jaggi, Matteo Matteucci. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2021

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:	State-of-the-art deep learning models, despite being successful in many applications, are sensitive to small perturbations of the input data. An adversary can easily craft such perturbations to attack a neural network and reduce its performance. This raises many reliability and security concerns about deploying deep learning models in real-world applications. Adversarial training methods aim to improve the robustness of a model to such attacks, but many of them, including state-of-the-art techniques such as Projected Gradient Descent (PGD), often lead to networks with lower unperturbed (clean) accuracy. Additionally, some fast adversarial training techniques, e.g. the Fast Gradient Sign Method (FGSM), suffer from a problem called catastrophic overfitting, which occurs when a model becomes very robust to the particular adversarial attack used during training but cannot generalize to others. Starting from these considerations and building on uncertainty estimation techniques, the thesis tackles these problems by introducing adversarial attacks that, instead of aiming to fool the network, aim to maximize its uncertainty. These attacks are extensively analyzed in various settings and under several uncertainty estimation frameworks, such as Bayesian Neural Networks (BNNs), Monte Carlo Dropout (MCD), and the Gaussian-process-based method Deterministic Uncertainty Estimation (DUE). Using the MNIST and CIFAR-10 datasets, it is shown that this approach, implemented both in the image space and in the latent space of a neural network, does not deteriorate the clean accuracy of the model and is robust to catastrophic overfitting and to PGD attacks.
Supervisors:	Barbara Caputo, Martin Jaggi, Matteo Matteucci
Academic year:	2021/22
Publication type:	Electronic
Number of pages:	90
Subjects:
Degree programme:	Master's degree programme in Data Science And Engineering
Degree class:	New regulations > Master's degree > LM-32 - COMPUTER ENGINEERING
Joint supervision institution:	École Polytechnique Fédérale de Lausanne (SWITZERLAND)
Collaborating companies:	ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE
URI:	http://webthesis.biblio.polito.it/id/eprint/20594
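
The central idea of the abstract, an attack that maximizes the model's uncertainty rather than its classification error, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the model is a toy linear softmax classifier, the uncertainty measure is the entropy of its predictive distribution (not a BNN, MCD, or DUE estimate), and the names `predictive_entropy`, `entropy_input_gradient`, and `uncertainty_attack` are hypothetical. The attack takes signed-gradient steps, in the style of FGSM/PGD, inside an L-infinity ball of radius `eps`.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D vector of logits.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def predictive_entropy(W, b, x):
    # Entropy of the softmax prediction of the toy linear model W @ x + b.
    p = softmax(W @ x + b)
    return float(-np.sum(p * np.log(p + 1e-12)))

def entropy_input_gradient(W, b, x):
    # Analytic gradient of the entropy H with respect to the input x.
    # For softmax outputs p: dH/dz_j = -p_j * (log p_j + H); chain rule gives W.T @ dH/dz.
    p = softmax(W @ x + b)
    log_p = np.log(p + 1e-12)
    h = -np.sum(p * log_p)
    return W.T @ (-p * (log_p + h))

def uncertainty_attack(W, b, x, eps=0.1, steps=1):
    # Signed-gradient ASCENT on the entropy (assumed objective), projected onto
    # the L-infinity ball of radius eps around x and onto the valid range [0, 1].
    alpha = eps / steps
    x_adv = x.copy()
    for _ in range(steps):
        g = entropy_input_gradient(W, b, x_adv)
        x_adv = x_adv + alpha * np.sign(g)
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv

# Toy two-class example (illustrative values only).
W = np.array([[2.0, 0.0], [-2.0, 0.0]])
b = np.zeros(2)
x = np.array([0.8, 0.5])
x_adv = uncertainty_attack(W, b, x, eps=0.1)
print(predictive_entropy(W, b, x), predictive_entropy(W, b, x_adv))
```

In this toy case the perturbed input moves toward the decision boundary, so its predictive entropy is higher than that of the clean input. With a deep network the gradient would come from automatic differentiation, and the uncertainty measure would depend on the chosen estimation framework.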
