Politecnico di Torino

FPGA-based Deep Learning Inference Acceleration at the Edge

Andrea Casale

FPGA-based Deep Learning Inference Acceleration at the Edge.

Rel. Mihai Teodor Lazarescu. Politecnico di Torino, Corso di laurea magistrale in Ingegneria Elettronica (Electronic Engineering), 2021

License: Creative Commons Attribution Non-commercial No Derivatives.


Deep Neural Networks (DNNs) have become the most widely used computational model in the majority of Machine Learning (ML) applications due to the level of accuracy they can achieve. This result comes at the cost of high computational complexity and memory demand for both training and inference, making DNN implementation on systems with limited resources and stringent energy constraints a challenging task. To address this challenge, exploiting the large amount of parallelism exhibited by such networks is the key to optimizing the execution of Deep Learning (DL) algorithms. This has motivated over time the development of dedicated accelerators, based on different hardware platforms, capable of making DNN inference at the edge efficient in terms of both latency and energy. In the context of low-power embedded applications, application-tailored accelerators based on custom hardware combined with approximate computing methods are the optimal solution for DNN inference at the edge, because they allow computationally expensive neural networks to be transformed into smaller, sparse models. Accepting a moderate accuracy loss compared to the uncompressed model makes the hardware implementation of complex DNNs feasible and yields significant performance improvements. In this scenario, Field Programmable Gate Arrays (FPGAs) represent one of the most promising solutions due to their high architectural flexibility and their ability to be reconfigured multiple times, which allows them to support a wide range of DNN model optimization algorithms and the development of highly parallel computing paradigms with energy-efficient dataflows to achieve high performance.
This suggests that, to improve the efficiency of complex DL algorithms on resource-constrained systems, at least for inference, FPGA-based accelerators are best suited to the characteristics of DNNs. This thesis addresses the problem of mapping complex DL algorithms onto FPGAs efficiently by analyzing three types of neural network models: DNNs and Convolutional Neural Networks (CNNs), concentrating on maximizing performance and accuracy, and Spiking Neural Networks (SNNs), used mainly for low-power applications. Another important aspect of optimizing DNN inference is therefore choosing the type of neuron model best tailored to the case at hand.
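To make the neuron-model distinction concrete, the sketch below shows a leaky integrate-and-fire (LIF) neuron, the event-driven model commonly used in SNNs, as opposed to the multiply-accumulate-plus-activation neuron of conventional DNNs and CNNs. The parameter values and function name are illustrative assumptions for this page, not taken from the thesis.

```python
def lif_neuron(inputs, leak=0.9, threshold=1.0):
    """Return the binary spike train produced by an input current sequence.

    A LIF neuron accumulates input into a leaky membrane potential and
    emits a spike (a single event) only when the potential crosses a
    threshold, which is what makes SNNs attractive for low-power edge
    hardware: computation happens only on spikes.
    """
    v = 0.0        # membrane potential
    spikes = []
    for current in inputs:
        v = leak * v + current    # leaky integration of the input current
        if v >= threshold:        # fire when the potential crosses threshold...
            spikes.append(1)
            v = 0.0               # ...then reset the potential
        else:
            spikes.append(0)
    return spikes

print(lif_neuron([0.4, 0.4, 0.4, 0.4]))  # → [0, 0, 1, 0]
```

A constant sub-threshold input thus produces sparse, periodic spikes rather than a dense activation value, which is the property exploited by low-power SNN accelerators.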

Supervisor: Mihai Teodor Lazarescu
Academic year: 2020/21
Publication type: Electronic
Number of Pages: 100
Degree course: Corso di laurea magistrale in Ingegneria Elettronica (Electronic Engineering)
Degree class: New organization > Master of Science > LM-29 - ELECTRONIC ENGINEERING
Collaborating institutions: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/17925