
Structured Pruning for Efficient Convolutional Networks

Giovanni Calleris

Structured Pruning for Efficient Convolutional Networks. Supervisors: Giuseppe Bruno Averta, Barbara Caputo, Fabio Cermelli, Claudia Cuttano. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2023

PDF (Tesi_di_laurea), 913 kB.
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract:

In recent years, Deep Learning models have grown in size, a trend enabled by advances in this research area and by the development of hardware targeted at it, and one that has led to better results overall. Bigger models, having more parameters, can adapt to a given task better than their smaller counterparts. In many real-life scenarios, however, small models are helpful, if not mandatory: they are needed to meet energy-consumption constraints, speed requirements, and hardware limitations. The lottery ticket hypothesis states that a trained model may contain a sub-network whose performance is similar to that of the whole model. In this work, we try to find this sub-network through pruning, a technique for removing parameters, starting from a pre-trained model.

This work proposes a method for applying structured pruning to Deep Learning models in Computer Vision, specifically image classification. The intent is to keep the original model's performance unchanged while reducing the number of parameters, and thus the model size, in a structured way: instead of removing single parameters, the method removes whole channels, avoiding a sparse network that is hard to optimise on hardware. Pruning methods usually base their decisions on a chosen criterion; the technique introduced in this work is instead trained to select which parameters to prune and which to keep, thus not introducing a bias into the selection. This is done by the OutputAnalyser class, which wraps the model, substitutes the 2D convolutional layers with PruningConv2D modules, and, at the end of the pruning phase, reinserts the now-pruned convolutional layers. Each PruningConv2D contains one of the original convolutional layers and a model, which we call the pruning-model. At training time, the pruning-models' predictions select which of the original layer's parameters to keep and which to discard; at test time, we use their last prediction to decide which channels to keep. Moreover, we avoid introducing additional, hard-to-balance loss terms to do this.

On the CIFAR-10 dataset, the method reaches an accuracy of 92.84%: removing 33.23% of the parameters and reducing the FLOPs by 38.23%, the error increases to 7.16% from the 6.47% of the original ResNet-110 model on this dataset. We used a specific set of pruning-models for this method, but the approach opens up new possibilities, since it allows the user to change their structure: the pruning-models can be deepened and enlarged at no additional computational cost for the final pruned model. The methodology could also be extended to other kinds of layers and other Deep Learning tasks.
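The wrapping mechanism described in the abstract can be illustrated with a minimal PyTorch sketch. This is a hypothetical reconstruction, not the thesis implementation: the class name PruningConv2D comes from the abstract, but the choice of a single trainable logit per output channel as the pruning-model (the thesis allows deeper pruning-models), the sigmoid soft mask, and the 0.5 hard threshold at test time are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class PruningConv2D(nn.Module):
    """Hypothetical sketch: wraps an existing Conv2d and learns a
    per-output-channel keep/discard decision via a small 'pruning-model'.
    Here the pruning-model is the simplest possible one: a trainable
    logit per output channel (an illustrative assumption)."""

    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        self.conv = conv
        # One logit per output channel; initialised so every channel
        # starts out "kept" (sigmoid(1.0) > 0.5).
        self.pruning_model = nn.Parameter(torch.ones(conv.out_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv(x)
        if self.training:
            # Soft, differentiable mask during the pruning phase.
            mask = torch.sigmoid(self.pruning_model)
        else:
            # Hard 0/1 decision at test time: the pruning-model's
            # last prediction selects the surviving channels.
            mask = (torch.sigmoid(self.pruning_model) > 0.5).float()
        return out * mask.view(1, -1, 1, 1)

    def kept_channels(self) -> torch.Tensor:
        """Indices of the output channels the pruning-model keeps."""
        return (torch.sigmoid(self.pruning_model) > 0.5).nonzero().flatten()


# Usage: wrap one convolutional layer and run a forward pass.
layer = PruningConv2D(nn.Conv2d(3, 8, kernel_size=3, padding=1))
layer.eval()
y = layer(torch.randn(1, 3, 8, 8))
```

In the full method, an OutputAnalyser-style wrapper would perform this substitution for every 2D convolutional layer in the model and, after the pruning phase, rebuild each Conv2d with only the channels reported by kept_channels(), yielding a dense, smaller network rather than a sparse one.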

Supervisors: Giuseppe Bruno Averta, Barbara Caputo, Fabio Cermelli, Claudia Cuttano
Academic year: 2023/24
Publication type: Electronic
Number of pages: 45
Subjects:
Degree programme: Master's degree programme in Data Science And Engineering
Degree class: New system > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies: NOT SPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/28492