On the Impact of Adversarial Training on Uncertainty Estimation and Uncertainty Targeted Attacks

Gilberto Manunza

Supervisors: Barbara Caputo, Martin Jaggi, Matteo Matteucci. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2021

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

State-of-the-art deep learning models, despite being successful in many applications, are sensitive to small perturbations of the input data. An adversary can easily craft such perturbations to attack a neural network and reduce its performance. This raises reliability and security concerns about deploying deep learning models in real-world applications. Adversarial training methods aim to improve the robustness of a model to such attacks, but many of them, including state-of-the-art techniques such as training with Projected Gradient Descent (PGD), often lead to networks with lower unperturbed (clean) accuracy. Additionally, some fast adversarial training techniques, e.g. those based on the Fast Gradient Sign Method (FGSM), suffer from catastrophic overfitting, which occurs when a model becomes very robust to the particular adversarial attack used during training but cannot generalize to others. Starting from these considerations and building on uncertainty estimation techniques, this thesis tackles these problems by introducing adversarial attacks that, instead of trying to fool the network, aim to maximize its uncertainty. These attacks are analyzed extensively in various settings and under several uncertainty estimation frameworks, such as Bayesian Neural Networks (BNNs), Monte Carlo Dropout (MCD), and the Gaussian-process-based method Deterministic Uncertainty Estimation (DUE). Experiments on the MNIST and CIFAR-10 datasets show that this approach, implemented both in the image space and in the latent space of a neural network, does not deteriorate the clean accuracy of the model and is robust to catastrophic overfitting and to PGD attacks.
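
To make the idea of an uncertainty-targeted attack concrete, the following is a minimal sketch and not the thesis implementation. It assumes a linear softmax classifier and uses the entropy of the softmax output as a stand-in for the BNN, MCD, and DUE uncertainty estimates named above. The function names, the parameters `eps`, `step`, and `n_steps`, and the `[0, 1]` input range are all illustrative assumptions. The perturbation ascends the gradient of the entropy with signed steps and is projected back onto an L-infinity ball, in the style of FGSM and PGD.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def predictive_entropy(W, b, x):
    # Entropy of the softmax output: the (assumed) uncertainty measure.
    p = softmax(W @ x + b)
    return float(-np.sum(p * np.log(p + 1e-12)))


def entropy_input_gradient(W, b, x):
    # Analytic gradient of the entropy H with respect to the input x.
    # For logits z: dH/dz_j = -p_j * (log p_j + H); chain rule through z = W x + b.
    p = softmax(W @ x + b)
    log_p = np.log(p + 1e-12)
    h = -np.sum(p * log_p)
    grad_logits = -p * (log_p + h)
    return W.T @ grad_logits


def uncertainty_attack(W, b, x, eps=0.1, step=0.05, n_steps=2):
    # Signed gradient ASCENT on the entropy (hypothetical helper):
    # the goal is to raise uncertainty, not to change the predicted label.
    x_adv = x.copy()
    for _ in range(n_steps):
        x_adv = x_adv + step * np.sign(entropy_input_gradient(W, b, x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay inside the L-inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # assumed valid input range
    return x_adv
```

For a two-class toy model with `W = 2 * np.eye(2)`, zero bias, and `x = np.array([1.0, 0.0])`, the attack moves the input toward the decision boundary, so the predictive entropy of the perturbed point is higher than that of the clean point while the perturbation stays within `eps`.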

Supervisors: Barbara Caputo, Martin Jaggi, Matteo Matteucci
Academic year: 2021/22
Publication type: Electronic
Number of Pages: 90
Degree programme: Master's degree programme in Data Science And Engineering
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Joint supervision institution: École Polytechnique Fédérale de Lausanne (Switzerland)
URI: http://webthesis.biblio.polito.it/id/eprint/20594