Politecnico di Torino

Uncertainty modeling in deep learning. Variational inference for Bayesian neural networks

Giacomo Deodato


Supervisor: Elisa Ficarra. Politecnico di Torino, Master's degree in Ingegneria Informatica (Computer Engineering), 2019

License: Creative Commons Attribution Non-commercial No Derivatives.

Over the last decades, deep learning models have rapidly gained popularity for their ability to achieve state-of-the-art performance in different inference settings. Deep neural networks have been applied to an increasing number of problems spanning different domains of application. Novel applications define a new set of requirements that go beyond accurate predictions and call for uncertainty measures. The aims of this study are to implement Bayesian neural networks and use the corresponding uncertainty estimates to perform predictions and dataset analysis. After an introduction to the concepts behind the Bayesian framework, we study variational inference and investigate its advantages and limitations in approximating the posterior distribution of the weights of neural networks. In particular, we underline the importance of choosing a good prior, we analyze the performance and uncertainty of models using normal priors and scale mixture priors, and we discuss the need to scale the complexity term of the variational objective during training. Furthermore, we identify two main advantages in modeling the predictive uncertainty of deep neural networks performing classification tasks. The first is the possibility of discarding highly uncertain predictions, thereby guaranteeing higher accuracy on those that remain. The second is the identification of unfamiliar patterns in the data that correspond to outliers in the model's representation of the training data distribution. The results show that the two priors lead to similar predictive distributions but different posteriors. Indeed, the scale mixture prior induces stronger regularization and sparsity of the weights.
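The scale mixture prior compared against the normal prior above can be sketched as a two-component Gaussian mixture over each weight, in the spirit of Bayes by Backprop. This is a minimal illustration, not the thesis implementation; the mixture weight `pi` and scales `sigma1`, `sigma2` are illustrative values.

```python
import math

def normal_pdf(w, sigma):
    # Density of a zero-mean Gaussian N(0, sigma^2) evaluated at w.
    return math.exp(-0.5 * (w / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def scale_mixture_log_prior(w, pi=0.5, sigma1=1.0, sigma2=0.05):
    # Log-density of the scale mixture prior:
    #   p(w) = pi * N(w; 0, sigma1^2) + (1 - pi) * N(w; 0, sigma2^2)
    # The narrow component (sigma2) concentrates mass near zero,
    # which is what drives the sparsity effect described above.
    return math.log(pi * normal_pdf(w, sigma1) + (1 - pi) * normal_pdf(w, sigma2))
```

Because the narrow component places much more density near zero, the mixture penalizes large weights like a broad Gaussian while still pulling small weights toward zero.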
Moreover, we find that the convergence of the model parameters to a reasonable optimum is highly correlated with scaling the Kullback–Leibler divergence by a factor equal to the dimensionality of the posterior, especially when the posterior cannot perfectly fit the prior distribution. Finally, the analysis of the predictive uncertainty shows that it is possible to isolate both wrong predictions and out-of-distribution input samples, i.e., corrupted observations or data belonging to different domains. In conclusion, our study highlights the opportunities and challenges of applying Bayesian neural networks in the context of image analysis, and proposes best practices for training such models with variational inference.
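The rejection strategy described above, discarding highly uncertain predictions to raise the accuracy of the remaining ones, can be sketched with a predictive-entropy threshold. The helper names and the threshold value below are hypothetical choices for illustration, not taken from the thesis.

```python
import math

def predictive_entropy(probs):
    # Entropy of a predictive class distribution; higher means more uncertain.
    # A confident one-hot prediction has entropy 0; a uniform one has log(K).
    return -sum(p * math.log(p) for p in probs if p > 0)

def filter_confident(predictions, threshold):
    # Keep only the (probs, label) pairs whose predictive entropy falls
    # below the threshold; the rest are flagged as too uncertain to trust.
    return [(probs, label) for probs, label in predictions
            if predictive_entropy(probs) < threshold]
```

For example, a sharply peaked prediction such as (0.9, 0.05, 0.05) survives a threshold of 0.5 nats, while a near-uniform one is discarded; the same score can also flag out-of-distribution inputs, which tend to produce flat predictive distributions.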

Supervisor: Elisa Ficarra
Academic year: 2018/19
Publication type: Electronic
Number of Pages: 122
Degree course: Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering)
Degree class: New organization > Master of Science > LM-32 - Computer Systems Engineering
Joint supervision institution: EURECOM - Telecom Paris Tech (FRANCE)
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/10920