
Design of an edge-oriented vector accelerator based on RISC-V "V" extension

Francesca Sica

Design of an edge-oriented vector accelerator based on RISC-V "V" extension.

Supervisors: Guido Masera, Maurizio Martina, Michele Caon, Walid Walid. Politecnico di Torino, Master's degree programme in Electronic Engineering (Corso di laurea magistrale in Ingegneria Elettronica), 2022

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


In the last decade, the ever-increasing diffusion of machine learning algorithms for digital signal processing has drastically changed the hardware processing requirements of edge devices. These systems have to manage large amounts of data while often still responding to events in real time. Edge computing is a suitable paradigm for processing the information they acquire: instead of sending raw data directly to remote central servers, the data is partially processed close to where it is collected, so that a smaller amount of processed data is sent to the central systems, reducing the response time, the network load and the overall power budget. To handle such large quantities of data, it is important to choose efficient architectures that exploit parallel computing, and vector processors have proven to be a promising solution. One advantage is the reduced overhead of fetching instructions from memory (the von Neumann bottleneck), which is typical of scalar processors running data-driven workloads: a single vector instruction can process a very large vector. Moreover, vector processors are highly flexible because they are programmable: to target a different application, it is sufficient to modify and recompile the code. This is not possible with custom hardware accelerators, which are designed for a single specific application. This versatility inherently comes at the cost of some area and power consumption overhead. Some of the most recent ISAs support "hardware-agnostic" software, so that the same code can run on vector architectures with different degrees of parallelism. Among these, RISC-V is one of the most promising: its vector extension ("V") allows the physical length of vectors and elements to be determined at run time.
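The idea of hardware-agnostic vector code can be illustrated with a small software model. The sketch below is not taken from the thesis: it is a simplified Python model of a strip-mined loop, in which the hypothetical helper `vsetvl` grants at most VLMAX = VLEN / SEW elements per iteration (a register grouping factor of 1 is assumed, and real hardware is allowed to choose the granted length differently in some cases). The names `vsetvl`, `vector_add`, `vlen_bits` and `sew_bits` are illustrative assumptions.

```python
def vsetvl(remaining, vlen_bits, sew_bits):
    """Simplified model: grant at most VLMAX = VLEN / SEW elements (LMUL = 1 assumed)."""
    vlmax = vlen_bits // sew_bits
    return min(remaining, vlmax)


def vector_add(a, b, vlen_bits, sew_bits=32):
    """Element-wise add written once, independent of the vector register size.

    Returns the result and the number of strips (loop iterations) needed.
    """
    assert len(a) == len(b)
    out = []
    strips = 0
    i = 0
    while i < len(a):
        # Ask the "hardware" how many elements it can process this iteration.
        vl = vsetvl(len(a) - i, vlen_bits, sew_bits)
        # One modeled vector instruction processes vl elements at once.
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
        strips += 1
    return out, strips


a = list(range(10))
b = list(range(10, 20))
for vlen in (128, 256, 512):
    result, strips = vector_add(a, b, vlen)
    print(f"VLEN={vlen}: {strips} strip(s), result={result}")
```

In this model the loop body never changes: only the number of iterations does (3, 2 and 1 strips for 128-, 256- and 512-bit registers with 10 elements of 32 bits), while the result is identical in every configuration.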
As a result, the same code can run on processors with different vector register sizes, delivering high performance and great versatility across application domains. In this thesis, a scalable and highly configurable vector processor based on the RISC-V "V" extension is designed and implemented. Most of its components (namely the vector register file and the processing elements) are organized as a set of identical lanes, each processing different elements of a vector independently of the others. The advantage of this structure is that, depending on the application and the power consumption target, priority can be given either to performance, by using a higher number of lanes and arithmetic operators, or to area and power, by accepting a slower yet smaller processor. The performance of the processor is evaluated on a workload representative of machine learning algorithms: matrix convolution is used as a case study, with a 4x4 matrix and a 2x2 filter of 32-bit elements. With a 256-bit vector register and 2 lanes, the program completes in 412 clock cycles with a throughput of 128 bits/cycle; synthesis results give a total cell area of about 1137755 um^2 and a clock frequency of 510 MHz. The same code is run on different hardware configurations: as expected, throughput and area scale almost linearly with the number of lanes. In conclusion, the designed vector processor is potentially very versatile: it implements a subset of the standard instructions that covers most use cases, and it is scalable and highly configurable, since the number of resources can be chosen to optimize the throughput.
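To make the case study concrete, the following sketch computes a 2x2 filter over a 4x4 matrix in a row-wise, vector-friendly style. It is an illustration only: the abstract does not show the thesis kernel, so the unflipped filter (cross-correlation, as is common in machine learning), the absence of padding, the stride of 1 and the sample values are all assumptions, as is the function name `conv2d_valid`. Each list comprehension stands in for one vector multiply-accumulate over a whole output row.

```python
def conv2d_valid(image, kernel):
    """2D filter with no padding and stride 1 (assumed), computed one output row at a time."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = []
    for r in range(oh):
        acc = [0] * ow  # accumulator "vector" for one output row
        for kr in range(kh):
            for kc in range(kw):
                w = kernel[kr][kc]  # scalar filter weight
                row_slice = image[r + kr][kc:kc + ow]  # shifted input row
                # Models one vector-scalar multiply-accumulate on the row.
                acc = [a + w * x for a, x in zip(acc, row_slice)]
        out.append(acc)
    return out


# Illustrative data only (not from the thesis).
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0],
          [0, 1]]

result = conv2d_valid(image, kernel)
print(result)  # [[7, 9, 11], [15, 17, 19], [23, 25, 27]]

# A 256-bit register holds 256 / 32 = 8 elements of 32 bits,
# so each 3-element output row fits in a single vector register.
ELEMS_PER_REG = 256 // 32
```

Under these assumptions a 4x4 input and a 2x2 filter produce a 3x3 output, and each output row needs four multiply-accumulate steps, one per filter weight.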

Supervisors: Guido Masera, Maurizio Martina, Michele Caon, Walid Walid
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 119
Degree programme: Master's degree programme in Electronic Engineering (Corso di laurea magistrale in Ingegneria Elettronica)
Degree class: New organization > Master science > LM-29 - ELECTRONIC ENGINEERING
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/24591