Giovanni Cascone
Weights Compression for Efficient Convolutional Neural Networks Acceleration on FPGA.
Advisors: Luciano Lavagno, Giovanni Brignone, Roberto Bosio, Teodoro Urso. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2025
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Convolutional Neural Networks (CNNs) have significantly advanced image recognition and computer vision. Their growing size and complexity are driven by the need for higher accuracy and the ability to tackle more complex tasks. Larger networks can learn richer and more abstract features at multiple levels, enabling them to recognize not only basic patterns (e.g., edges and textures) but also more complex structures such as objects and faces, even under challenging conditions like varying lighting or cluttered scenes. CNNs scale well, but larger networks demand more memory and compute capacity. This often becomes a problem on FPGAs, which impose stringent resource constraints, and this work addresses those constraints to enable effective implementation.
This thesis addresses the challenge of deploying large Convolutional Neural Networks, such as MobileNet or ResNet-50, on FPGAs. To overcome the limited on-chip memory capacity, the network weights (which constitute the majority of the memory footprint) are first compressed offline using entropy-based techniques and stored in external DDR memory.
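Entropy-based compression exploits the fact that quantized CNN weights are far from uniformly distributed (they cluster heavily around zero), so frequent values can be assigned shorter codes. As an illustration only, not the thesis's actual pipeline, the sketch below Huffman-codes a hypothetical int8 weight list offline and reports the compression ratio relative to raw 8-bit storage; all names and the toy weight distribution are invented for the example.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix-free Huffman code (symbol -> bitstring) from a symbol list."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, unique tiebreak, partial code table).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # two least-frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}   # left branch
        merged.update({s: "1" + c for s, c in c2.items()})  # right branch
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def compress(weights):
    """Entropy-code a list of quantized weights into a bitstring plus its code table."""
    code = huffman_code(weights)
    bits = "".join(code[w] for w in weights)
    return bits, code

# Hypothetical quantized weight tensor: int8 values peaked around zero,
# as is typical after CNN quantization (distribution invented for illustration).
weights = [0] * 50 + [1] * 20 + [-1] * 20 + [3] * 5 + [-7] * 5
bits, code = compress(weights)
raw_bits = 8 * len(weights)  # uncompressed cost at 8 bits per weight
ratio = raw_bits / len(bits)
```

In a real FPGA flow the compressed stream would live in DDR and a lightweight on-chip decoder would expand it on the fly; here the bitstring and code table simply stand in for that stored representation.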