Design of a Flexible Hardware Accelerator for Ultra-Low Power Quantized Neural Networks based on Serial Multipliers

Mattia Carlo Petruzzellis

Supervisors: Maurizio Martina, Guido Masera, Francesco Conti. Politecnico di Torino, Master's degree programme in Electronic Engineering, 2019

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:	Today, our society is experiencing a new revolution that goes under the name of Artificial Intelligence (AI). The part of AI that is gaining the most attention is Deep Learning (DL), whose main idea is to use a Deep Neural Network (DNN) so that a machine can learn through training and then perform, through inference, tasks as a human being would. These tasks include autonomous driving, speech recognition, computer vision and many others. DL is taking off compared to other well-known and well-documented Machine Learning (ML) algorithms because of its ability to take advantage of huge amounts of data. Whereas the latter show no considerable performance boost once the available data exceeds a certain threshold, DL keeps benefiting from additional data, and the more so the larger the designed Neural Network (NN). Moreover, the digitization of our society, which has moved many human activities to the digital realm, created conditions in which DNNs can thrive and keep improving, without the hassle of looking for data elsewhere. Depending on the specific application for which a DNN is developed, one may be pushed towards different hardware platforms and architectures. The main operation that a DNN is required to perform is the multiply-and-accumulate (MAC), and since DL is so computationally hungry, graphics processing units (GPUs) are often used because of their highly parallel computational capabilities as well as their superior accuracy. Still, for applications where energy consumption plays a major role, there has been growing interest in the development of specialized hardware accelerators, either ASIC- or FPGA-based.
Specifically, this thesis focuses on the development of an ultra-low-power quantized Convolutional Neural Network (CNN) hardware accelerator, potentially applicable to Internet of Things (IoT) end-nodes for near-sensor analytics, which could be interfaced with the open-source Hardware Processing Engine (HWPE) in the PULP platform developed by ETH Zurich. The idea behind edge computing is to move the analytics from data centers closer to the sensors in order to tackle the power limitations that come with the amount of data to be sent and the available bandwidth. By running inference on the spot, one can extract the useful information and send only that to the data center, which is far less expensive than sending all the raw data. The developed hardware accelerator uses 8-bit activations and 4-bit weights and performs multiplications serially in order to achieve a higher maximum operating frequency than its parallel counterpart while keeping the throughput, i.e. the number of operations per clock cycle, unchanged. This, however, comes at the cost of an area increase of 2.91x compared to the parallel solution. Furthermore, quantization inevitably leads to a loss in accuracy compared to full precision, but this can still be considered acceptable when the loss is kept below 5%, as is usually the case for the above-mentioned bit widths. Overall, this results in a rather flexible solution, as discussed further in the thesis.
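
The serial multiplication mentioned in the abstract can be illustrated with a small software model. The sketch below is not the thesis design: it assumes unsigned 8-bit activations, signed two's-complement 4-bit weights, and an LSB-first shift-and-add scheme, none of which the abstract specifies. The function names `serial_multiply` and `serial_mac` are hypothetical.

```python
def serial_multiply(activation: int, weight: int, weight_bits: int = 4) -> int:
    """Multiply an unsigned 8-bit activation by a signed weight,
    consuming one weight bit per step (LSB first).

    Assumption: the signedness and bit ordering are illustrative choices,
    not taken from the thesis.
    """
    assert 0 <= activation < 256, "activation must fit in 8 unsigned bits"
    lo, hi = -(1 << (weight_bits - 1)), (1 << (weight_bits - 1)) - 1
    assert lo <= weight <= hi, "weight must fit in the signed bit width"

    # Two's-complement bit pattern of the weight.
    pattern = weight & ((1 << weight_bits) - 1)

    acc = 0
    for i in range(weight_bits):
        bit = (pattern >> i) & 1
        partial = (activation << i) if bit else 0
        if i == weight_bits - 1:
            # The most significant bit carries a negative place value.
            acc -= partial
        else:
            acc += partial
    return acc


def serial_mac(activations, weights) -> int:
    """Multiply-and-accumulate built on the serial multiplier above."""
    acc = 0
    for a, w in zip(activations, weights):
        acc += serial_multiply(a, w)
    return acc
```

In a hardware realization each loop iteration would roughly correspond to one clock cycle with a short critical path (one shift and one add). How the thesis arranges its serial units to keep the throughput unchanged is not stated in the abstract.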
Supervisors:	Maurizio Martina, Guido Masera, Francesco Conti
Academic year:	2018/19
Publication type:	Electronic
Number of pages:	103
Subjects:
Degree programme:	Master's degree programme in Electronic Engineering
Degree class:	New regulations > Master's degree > LM-29 - ELECTRONIC ENGINEERING
Collaborating companies:	NOT SPECIFIED
URI:	http://webthesis.biblio.polito.it/id/eprint/11015