Chen Yi Zhang
Learning capabilities of belief propagation based algorithms for sparse binary neural networks.
Supervisors: Luca Dall'Asta, Jean Barbier. Politecnico di Torino, Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi), 2020
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
With the growth in size and complexity of modern data sets, the computational and energy costs of training large, fully connected neural networks have become an important issue. Exploring the learning capabilities of sparse (possibly binarized) architectures, which have far fewer degrees of freedom but empirically appear to generalize well, is therefore an important research direction. In this work, the learning performance of belief propagation (BP) based algorithms, applied to simple two-layer sparse neural networks with discrete synapses, is analysed. The work is first carried out in the so-called teacher-student scenario, in which the learning problem corresponds to inferring the weight values of a teacher network whose architecture is given. In this first part, BP provides encouraging results, allowing perfect reconstruction of the weights of the ‘teacher’ network that generated the training data from only a small number of data points. Subsequently, the focus shifts to the mismatched setting, in which the architecture used for learning differs from that of the ‘teacher’ network. The results are analysed to assess whether BP-based learning remains relevant in this case as well and, if so, up to what degree of mismatch. |
---|---|
Supervisors: | Luca Dall'Asta, Jean Barbier |
Academic year: | 2020/21 |
Publication type: | Electronic |
Number of Pages: | 42 |
Subjects: | |
Degree programme: | Master's degree programme in Physics Of Complex Systems (Fisica Dei Sistemi Complessi) |
Degree class: | New organization > Master science > LM-44 - MATHEMATICAL MODELLING FOR ENGINEERING |
Joint supervision institution: | Universite de Paris-Sud (Paris XI) (FRANCE) |
Collaborating organizations: | ICTP |
URI: | http://webthesis.biblio.polito.it/id/eprint/15932 |