
Sparsification of deep neural networks via ternary quantization

Luca Dordoni


Rel. Enrico Magli, Giulia Fracastoro, Sophie Fosson, Andrea Migliorati, Tiziano Bianchi. Politecnico di Torino, Corso di laurea magistrale in Physics Of Complex Systems (Fisica Dei Sistemi Complessi), 2023

PDF (Tesi_di_laurea), 8 MB. License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract:

In recent years, deep neural networks (DNNs) have achieved remarkable results in several machine learning tasks, especially in computer vision applications, where they often surpass human performance. Typical deep models consist of tens of layers and millions of parameters, resulting in high memory consumption and computational cost. At the same time, the demand for smaller models is growing rapidly, driven by the need to deploy DNNs in resource-constrained environments such as mobile devices. Methods that tackle this challenge and obtain more compact networks while preserving performance rely on quantization or sparsification of the parameters. This thesis explores a combination of the two techniques, i.e., a sparsification method based on the ternarization of network parameters. Our approach extends plain binarization of the parameters by adding a quantization interval of width Δ centered around zero: parameters falling inside it are set to zero and removed from the network topology. Specifically, we use a ResNet-20 architecture to tackle the image recognition problem on the CIFAR-10 dataset. We show that increasing Δ as training proceeds achieves sparsification rates above 90% while also improving classification accuracy over the binary framework. Despite the increased complexity of implementing the ternarization scheme compared to a binary quantizer, we demonstrate that these remarkable sparsity rates translate into parameter distributions with significantly lower average entropy (around 0.6 bits/symbol), and therefore into highly compressible sources. Our findings have significant implications for the development of more efficient deep neural networks.
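As a rough illustration of the scheme described in the abstract, the sketch below implements a ternary quantizer of this kind in PyTorch, together with the empirical entropy of the resulting {-1, 0, +1} symbols. The framework choice is our assumption, as is the fixed value of Δ: the thesis grows Δ during training, whereas here it is set once to a hypothetical value chosen only to reproduce the reported sparsity and entropy regime on Gaussian toy weights.

    import torch

    def ternarize(w: torch.Tensor, delta: float) -> torch.Tensor:
        # Parameters inside the zero interval [-delta/2, delta/2] are
        # pruned to 0; the remaining ones keep only their sign, as in
        # plain binarization.
        zero_mask = w.abs() <= delta / 2
        return torch.where(zero_mask, torch.zeros_like(w), torch.sign(w))

    def entropy_bits(q: torch.Tensor) -> float:
        # Shannon entropy (bits/symbol) of the empirical distribution
        # over the ternary alphabet {-1, 0, +1}; lower entropy means a
        # more compressible parameter source.
        probs = torch.stack([(q == v).float().mean() for v in (-1.0, 0.0, 1.0)])
        probs = probs[probs > 0]
        return float(-(probs * probs.log2()).sum())

    # Toy usage: delta = 3.3 is a hypothetical value that zeroes roughly
    # 90% of standard-normal weights, yielding an entropy near the
    # 0.6 bits/symbol regime mentioned in the abstract.
    w = torch.randn(100_000)
    q = ternarize(w, delta=3.3)
    sparsity = (q == 0).float().mean().item()
    print(f"sparsity: {sparsity:.1%}, entropy: {entropy_bits(q):.2f} bits/symbol")

Note that the entropy here is computed on the quantized symbols only; a real compression pipeline would additionally exploit the sparse topology, but the low symbol entropy already indicates a highly compressible source.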

Relators: Enrico Magli, Giulia Fracastoro, Sophie Fosson, Andrea Migliorati, Tiziano Bianchi
Academic year: 2023/24
Publication type: Electronic
Number of Pages: 78
Degree course: Corso di laurea magistrale in Physics Of Complex Systems (Fisica Dei Sistemi Complessi)
Degree class: New organization > Master of Science > LM-44 - MATHEMATICAL MODELLING FOR ENGINEERING
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/29424