Politecnico di Torino

FPGA Implementation of a Deep Learning Inference Accelerator for Autonomous vehicles

Giuseppe Cesarano

FPGA Implementation of a Deep Learning Inference Accelerator for Autonomous vehicles.

Supervisor: Maurizio Martina. Politecnico di Torino, Master's degree programme in Electronic Engineering (Ingegneria Elettronica), 2018

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


The thesis discusses the implementation of the NVIDIA Deep Learning Accelerator (NVDLA) on an FPGA. The NVDLA is a special-purpose accelerator for deep learning inference on neural networks, developed by NVIDIA, whose source code has been freely released by its developers. First, an overview of deep learning and convolutional neural networks is given. Then, different categories of accelerators are introduced (GPUs, manycore architectures, neuromorphic devices and special-purpose accelerators), with examples from each category and an analysis of their performance and applicability in the automotive field. After the similarities and differences among these accelerators have been considered, the NVDLA system is described, highlighting its modularity and configurability. Each block is explained and an overview of how the system works is given. Next, the FPGA chosen by Magneti Marelli for this thesis, the Xilinx Zynq UltraScale+, is introduced, and the integration of the NVDLA with the FPGA is described. In the second part of the thesis, the implementation of the NVDLA on the FPGA is analyzed step by step: the tools used (Vivado Simulator, Xilinx SDK, PetaLinux) are introduced, and then both the hardware and the software implementations are described. Resource utilization is analyzed, identifying the percentage of FPGA resources used by the smallest version of the NVDLA. For this architecture, a detailed timing analysis has also been performed, taking into account the critical path and timing closure. The system has then been verified at different frequencies, and its performance evaluated by first testing the individual NVDLA blocks separately and then running a complete neural network architecture, AlexNet.
A comparison with other architectures has been performed. Finally, different versions of the NVDLA, obtained by increasing the size of the individual engines, are analyzed, observing the resulting rise in FPGA resource utilization. A complete power consumption analysis has been carried out to better understand the performance and to compare the different versions.

Supervisors: Maurizio Martina
Academic year: 2018/19
Publication type: Electronic
Number of Pages: 106
Degree programme: Master's degree programme in Electronic Engineering (Ingegneria Elettronica)
Degree class: New organization > Master science > LM-29 - ELECTRONIC ENGINEERING
Collaborating companies: Magneti Marelli spa
URI: http://webthesis.biblio.polito.it/id/eprint/9033