Matteo Pellassa, Michele Tomatis
Optimized VLSI architectures for efficient sparsity exploitation in Deep Learning.
Supervisor: Maurizio Martina. Politecnico di Torino, Master's degree programme in Ingegneria Elettronica (Electronic Engineering), 2021
PDF (Tesi_di_laurea) - Thesis (4MB)
License: Creative Commons Attribution Non-commercial No Derivatives.
Archive (ZIP) (Documenti_allegati) - Other (150MB)
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Artificial intelligence (AI) now plays a predominant role in many areas, including robotics, computer vision for medicine, and autonomous driving. However, the algorithms used in this field are highly sophisticated and known to be both compute- and memory-intensive, so techniques that improve efficiency by reducing the number of computations without losing accuracy are becoming critical. This thesis focuses on the convolutional layer of Convolutional Neural Networks (CNNs) and aims to improve efficiency by avoiding useless computations. Starting from the SqueezeFlow architecture, which employs a PT-OS-sparse dataflow to exploit sparsity in the kernel matrices, we develop an architecture that exploits sparsity in the input matrices and therefore employs a PT-KS-sparse dataflow.
A two-level memory hierarchy is also introduced to reduce latency and energy consumption in data retrieval.
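To make "useless computations" concrete, the sketch below shows in plain Python how a 2D convolution can skip every multiply-accumulate whose input activation is zero, which is the software analogue of exploiting input sparsity. It is only an illustration under that assumption: it does not reproduce the PT-KS-sparse dataflow, the two-level memory hierarchy, or the SqueezeFlow hardware, and the function name is ours.

import numpy as np

def sparse_input_conv2d(inp, kernel):
    # Valid cross-correlation (the usual deep-learning "convolution")
    # that iterates over non-zero input activations only, so zero-valued
    # pixels generate no multiply-accumulate work at all.
    H, W = inp.shape
    K, _ = kernel.shape
    out = np.zeros((H - K + 1, W - K + 1))
    for y, x in zip(*np.nonzero(inp)):            # non-zero inputs only
        val = inp[y, x]
        for ky in range(K):
            for kx in range(K):
                oy, ox = y - ky, x - kx           # output positions fed by (y, x)
                if 0 <= oy < out.shape[0] and 0 <= ox < out.shape[1]:
                    out[oy, ox] += val * kernel[ky, kx]
    return out

A dense implementation performs one multiply-accumulate for every (output pixel, kernel weight) pair regardless of the data; here the work scales with the number of non-zero activations, which is the kind of saving the thesis pursues in hardware.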
