
An extended RT-level model for an NVIDIA GPU for safety-critical applications

Riccardo Faggiano

An extended RT-level model for an NVIDIA GPU for safety-critical applications.

Supervisors: Matteo Sonza Reorda, Josie Esteban Rodriguez Condia, Luca Sterpone. Politecnico di Torino, Master's degree programme in Electronic Engineering (Ingegneria Elettronica), 2022

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


A General-Purpose Graphics Processing Unit (GPGPU) is a Graphics Processing Unit (GPU) programmed for purposes beyond graphics processing, such as computations typically performed by a Central Processing Unit (CPU). Using GPUs for general-purpose computing complements the CPU by accelerating portions of an application while the rest continues to run on the CPU, resulting in a faster, higher-performance application. In recent years GPGPUs have been widely used in many fields, such as digital signal processing (DSP), bioinformatics, and machine learning. They are also becoming popular in safety-critical applications such as autonomous and semi-autonomous vehicles. Although they improve performance, these devices suffer from transient faults (those produced by radiation effects), which can cause misbehaviors that are not acceptable in critical applications. The University of Massachusetts has developed FlexGrip, a soft integer GPGPU model optimized for FPGA implementation. This model can be used to analyze the impact of transient faults. However, it has some restrictions that limit the analyses that can be performed, so a new version of the model, FlexGripPlus, was implemented to overcome these limits. FlexGripPlus improves on the previous architecture, but some aspects still keep it far from a real GPGPU. The main limitation is that it has only one Streaming Multiprocessor (SM), the core of the device, which includes other elements such as the Scalar Processors, the warp scheduler, register files, and other memories. The aim of this work is to extend the FlexGripPlus model to obtain an architecture that can run its applications on a generic number of Streaming Multiprocessors. The correctness of the execution is tested through simulations, and a reliability analysis is conducted on the heavily modified blocks to estimate the effects of faults.
The Block Scheduler, which is in charge of assigning each block (group of threads) to a specific Streaming Multiprocessor, is the crucial point of the structure, and it has been modified to work with multiple SMs. The round-robin algorithm was chosen as the scheduling technique. Several controllers are connected to the Block Scheduler and, as a consequence, they have also been modified to allow the correct execution of the application. The presence of a generic number of Streaming Multiprocessors also affects all the memories in the design (such as the global memory, the system memory, and the constant memory). Accesses to them must be carefully managed; otherwise, wrong data could be read or written. A system of arbiters has been added to control all these kinds of requests. Different applications with different configurations of SMs, blocks, and threads are simulated in ModelSim, and the results are compared with those obtained with the original project. Transient faults are injected, considering different programs and different configurations, to analyze the effects on all the signals belonging to the Block Scheduler.
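
As an illustration of the round-robin policy mentioned above, the following is a minimal behavioral sketch in Python, not the RT-level implementation described in the thesis. The function name `round_robin_schedule` and its parameters are hypothetical, and the sketch assumes a static assignment in which every SM can always accept the next block; the actual Block Scheduler may also have to account for SM availability and resource limits.

```python
def round_robin_schedule(num_blocks: int, num_sms: int) -> dict[int, list[int]]:
    """Assign block IDs 0..num_blocks-1 to SMs in circular order.

    Hypothetical helper for illustration only: it assumes every SM can
    always accept the next block (no resource or occupancy checks).
    """
    if num_sms <= 0:
        raise ValueError("num_sms must be positive")
    if num_blocks < 0:
        raise ValueError("num_blocks must be non-negative")

    # One (initially empty) list of block IDs per SM.
    assignment: dict[int, list[int]] = {sm: [] for sm in range(num_sms)}

    next_sm = 0  # pointer to the SM that receives the next block
    for block_id in range(num_blocks):
        assignment[next_sm].append(block_id)
        # Advance the pointer and wrap around after the last SM.
        next_sm = (next_sm + 1) % num_sms
    return assignment


if __name__ == "__main__":
    # Example: 5 blocks distributed over 2 SMs.
    for sm, blocks in round_robin_schedule(5, 2).items():
        print(f"SM {sm}: blocks {blocks}")
```

With 5 blocks and 2 SMs, this sketch places blocks 0, 2, and 4 on SM 0 and blocks 1 and 3 on SM 1, so the per-SM load differs by at most one block.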

Supervisors: Matteo Sonza Reorda, Josie Esteban Rodriguez Condia, Luca Sterpone
Academic year: 2021/22
Publication type: Electronic
Number of Pages: 44
Degree programme: Master's degree programme in Electronic Engineering (Ingegneria Elettronica)
Degree class: New organization > Master science > LM-29 - ELECTRONIC ENGINEERING
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/22765