Farzad Nikfam
Fast Adversarial Training for Deep Neural Networks.
Supervisors: Maurizio Martina, Muhammad Shafique. Politecnico di Torino, Corso di laurea magistrale in Mechatronic Engineering (Ingegneria Meccatronica), 2020
PDF (Tesi_di_laurea) - Thesis, 4 MB. License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract:
The topic of this thesis is Machine Learning from the software point of view, nowadays one of the main research routes for the management of large databases. Machine Learning is already widely present in our daily lives: we find it, for example, in the anti-spam filters of e-mail, in the facial recognition of cameras, in the automatic correctors of smartphones, and in weather forecasts. The aim of the thesis is to review algorithms, written in Python, for models that are robust to adversarial attacks, and to apply fast training techniques to them in order to reduce computational time. "Fast training" refers to code that learns to distinguish and classify the data of a large database in a reasonable time, according to the learning rules given by the programmer. The main difficulty of fast training is finding an algorithm that is fast yet still accurate: too much accuracy may require learning times that are too long to be acceptable, while too high a convergence speed could lead to wrong results or, even worse, cause the training to diverge instead of converging. "Adversarial training", instead, refers to code that is robust against data modified on purpose: modifications that cannot be perceived by a human but that can have very negative effects on a Deep Neural Network (DNN) model. For example, an attack can modify a few pixels of an image without any visible difference for the human eye, yet cause the image to be completely misclassified by the DNN.

The first step was to study the basic concepts of Machine Learning, then to write simple code in MATLAB, whose fluidity allows faster learning, and finally to review and write more complex algorithms in Python, whose flexibility offers many more options. The main Python libraries for Machine Learning are TensorFlow (developed by Google) and PyTorch (developed by Facebook). This thesis focuses on TensorFlow, which makes it possible to concentrate on the problem using a very high-level language that optimizes the computational effort of the machine. The principal model used for this purpose is the DNN, a virtual representation and simplification of the human brain. The model works cyclically: it is trained on a training dataset of images and then tested on a validation dataset. Due to computational limits and to the huge size of the datasets, a training session can take from a few hours to several days, so the goal of this thesis is to speed up the training time.

The general method to improve training performance is the fine-tuning of hyperparameters such as learning rate, momentum, weight decay, and batch size. The learning rate is the most important hyperparameter: by modifying its shape and value during training, we can obtain a model robust to adversarial attacks up to 2 times faster than with normal training. The tests were performed on two different datasets (CIFAR10 and CIFAR100) and with various learning-rate shapes, looking for the best results with a trial-and-error approach.
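As an illustration of the approach summarized in the abstract, the sketch below shows a minimal single-step (FGSM-style) adversarial training loop in TensorFlow/Keras on CIFAR10, driven by a triangular ("one-cycle") learning-rate schedule. The network architecture, the perturbation budget EPSILON, the schedule shape, and all other hyperparameter values are illustrative assumptions, not the code or settings used in the thesis.

```python
import tensorflow as tf

# Load and normalize CIFAR10.
(x_train, y_train), _ = tf.keras.datasets.cifar10.load_data()
x_train = x_train.astype("float32") / 255.0
y_train = y_train.astype("int64").squeeze()

# Small illustrative CNN (not the architecture used in the thesis).
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.SGD(momentum=0.9)

# Assumed hyperparameters, chosen only for illustration.
EPOCHS, BATCH, EPSILON, MAX_LR = 10, 128, 8.0 / 255.0, 0.2
steps_per_epoch = len(x_train) // BATCH
total_steps = EPOCHS * steps_per_epoch

def cyclic_lr(step):
    """Triangular schedule: ramp the learning rate up to MAX_LR, then back to ~0."""
    half = total_steps / 2
    return MAX_LR * (step / half if step < half else (total_steps - step) / half)

@tf.function
def train_step(x, y, lr):
    optimizer.learning_rate.assign(lr)
    # FGSM: single-step perturbation along the sign of the input gradient of the loss.
    with tf.GradientTape() as tape:
        tape.watch(x)
        attack_loss = loss_fn(y, model(x, training=True))
    x_adv = tf.clip_by_value(x + EPSILON * tf.sign(tape.gradient(attack_loss, x)), 0.0, 1.0)
    # Train on the adversarial examples (fast, single-step adversarial training).
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x_adv, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

dataset = (tf.data.Dataset.from_tensor_slices((x_train, y_train))
           .shuffle(50_000).batch(BATCH, drop_remainder=True))
step = 0
for epoch in range(EPOCHS):
    for x, y in dataset:
        loss = train_step(x, y, tf.constant(cyclic_lr(step), tf.float32))
        step += 1
    print(f"epoch {epoch}: last batch loss = {float(loss):.3f}")
```

The single-step attack keeps the cost per batch close to that of standard training, while the cyclic learning-rate schedule is one way of reshaping the learning rate over the run, in the spirit of the tuning strategy described in the abstract.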
Supervisors: Maurizio Martina, Muhammad Shafique
Academic year: 2019/20
Publication type: Electronic
Number of pages: 76
Subjects:
Degree programme: Corso di laurea magistrale in Mechatronic Engineering (Ingegneria Meccatronica)
Degree class: New regulations > Master's degree > LM-25 - INGEGNERIA DELL'AUTOMAZIONE
Collaborating companies: NOT SPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/14386