Uncertainty modeling in deep learning. Variational inference for Bayesian neural networks

Giacomo Deodato

Supervisor: Elisa Ficarra. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2019

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:	Over the last decades, deep learning models have rapidly gained popularity for their ability to achieve state-of-the-art performance in different inference settings. Deep neural networks have been applied to an increasing number of problems spanning different domains of application. Novel applications define a new set of requirements that transcend accurate predictions and depend on uncertainty measures. The aims of this study are to implement Bayesian neural networks and to use the corresponding uncertainty estimates to perform predictions and dataset analysis. After an introduction to the concepts behind the Bayesian framework, we study variational inference and investigate its advantages and limitations in approximating the posterior distribution of the weights of neural networks. In particular, we underline the importance of choosing a good prior, we analyze the performance and uncertainty of models using normal priors and scale mixture priors, and we discuss the need to scale the complexity term of the variational objective during the training of the model. Furthermore, we identify two main advantages in modeling the predictive uncertainty of deep neural networks performing classification tasks. The first is the possibility of discarding highly uncertain predictions in order to guarantee a higher accuracy on the remaining predictions. The second is the identification of unfamiliar patterns in the data that correspond to outliers in the model representation of the training data distribution. The results show that the two priors used lead to similar predictive distributions but different posteriors. In fact, the scale mixture prior induces better regularization and sparsity of the weights.
Moreover, we find that the convergence of the model parameters to a reasonable optimum is highly correlated with the scaling of the Kullback–Leibler divergence by a factor equal to the dimensionality of the posterior, especially when the posterior cannot perfectly fit the prior distribution. Finally, the analysis of the predictive uncertainty shows that it is possible to isolate both wrong predictions and out-of-distribution input samples, that is, corrupted observations or data belonging to different domains. In conclusion, our study highlights the opportunities and challenges of applying Bayesian neural networks in the context of image analysis, and proposes some best practices for training such models with variational inference.
Supervisors:	Elisa Ficarra
Academic year:	2018/19
Publication type:	Electronic
Number of pages:	122
Subjects:
Degree programme:	Master's degree programme in Computer Engineering (Ingegneria Informatica)
Degree class:	New system > Master's degree > LM-32 - COMPUTER ENGINEERING (INGEGNERIA INFORMATICA)
Joint supervision institution:	EURECOM - Telecom Paris Tech (FRANCE)
Collaborating companies:	NOT SPECIFIED
URI:	http://webthesis.biblio.polito.it/id/eprint/10920
