
Designing and Evaluating Mapping of CNN layers on an edge-CGRA

Nicolo' Carpentieri

Designing and Evaluating Mapping of CNN layers on an edge-CGRA.

Supervisors: Daniele Jahier Pagliari, Maurizio Martina, Alessio Burrello, Davide Schiavone, Juan Sapriza. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2024

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


Convolutional Neural Networks (CNNs) play a crucial role in image processing and computer vision, where they are used extensively for tasks such as image enhancement, filtering, and feature detection. Accelerating CNNs therefore depends on implementing convolution efficiently on the target hardware. The primary aim of this thesis is to explore different convolution implementations on Coarse-Grained Reconfigurable Arrays (CGRAs).

CGRAs depart from conventional computing architectures and offer enhanced flexibility and energy efficiency. Application-Specific Integrated Circuits (ASICs) are efficient but inflexible, and Graphics Processing Units (GPUs) are versatile but consume high power; CGRAs strike a balance between the two by being programmable at the instruction level. Compared with Field-Programmable Gate Arrays (FPGAs), which are configured at the bit level, this reduces configuration complexity and latency while balancing performance, space, and energy efficiency. CGRAs serve as fast, energy-efficient accelerators in IoT processors and embedded systems, where they speed up computationally demanding operations. A CGRA comprises Processing Elements (PEs) arranged in a two-dimensional grid. Connections between adjacent PEs let data pass directly from one PE to its neighbours, which streamlines arithmetic, in particular the accumulations that are common in convolution.

This thesis assesses software implementation approaches for CNNs, with a specific emphasis on improving energy efficiency and minimizing latency. It compares the conventional direct convolution method with the IM2COL technique, which reorganizes the input data into columns so that convolution can be computed as a matrix multiplication.
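The difference between direct convolution and IM2COL can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the single-channel input, the stride of 1, and the absence of padding are all assumptions chosen to keep the example small.

```python
def direct_conv2d(image, kernel):
    """Direct convolution (cross-correlation form, as commonly used in CNNs)."""
    H, W, K = len(image), len(image[0]), len(kernel)
    out_h, out_w = H - K + 1, W - K + 1
    out = [[0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            acc = 0
            for ky in range(K):
                for kx in range(K):
                    acc += image[y + ky][x + kx] * kernel[ky][kx]
            out[y][x] = acc
    return out


def im2col(image, K):
    """Copy every KxK patch into one column of a (K*K) x (out_h*out_w) matrix."""
    H, W = len(image), len(image[0])
    out_h, out_w = H - K + 1, W - K + 1
    cols = [[0] * (out_h * out_w) for _ in range(K * K)]
    for y in range(out_h):
        for x in range(out_w):
            for ky in range(K):
                for kx in range(K):
                    cols[ky * K + kx][y * out_w + x] = image[y + ky][x + kx]
    return cols


def im2col_conv2d(image, kernel):
    """Convolution as (flattened kernel row vector) x (im2col matrix)."""
    K = len(kernel)
    out_w = len(image[0]) - K + 1
    cols = im2col(image, K)
    flat_kernel = [w for row in kernel for w in row]
    n_cols = len(cols[0])
    flat_out = [sum(flat_kernel[r] * cols[r][c] for r in range(K * K))
                for c in range(n_cols)]
    # Fold the flat result back into a 2D output map.
    return [flat_out[i:i + out_w] for i in range(0, n_cols, out_w)]


# Hypothetical toy input: 4x4 image, 3x3 vertical-edge kernel.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]

cols = im2col(image, 3)
# Ratio of im2col matrix size to original input size (data replication).
replication = (len(cols) * len(cols[0])) / (len(image) * len(image[0]))
```

In this toy case both functions return the same 2x2 output, while the IM2COL matrix holds 36 values for a 16-value input, a replication factor of 2.25.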
While IM2COL can improve computational efficiency, it also increases memory demands because it replicates data. To reduce the energy spent on data movement, the thesis explores two dataflow strategies: weight stationary and output stationary. The weight-stationary dataflow maximizes weight reuse within the PEs, whereas the output-stationary dataflow reduces the energy overhead of handling partial sums by keeping them in the register file (RF) of the PEs. The thesis also exploits three types of parallelism to increase throughput and reduce latency: across output channels, across input channels, and across the spatial dimensions of the filter.

The results indicate that the most effective approach, for both energy efficiency and latency, is to parallelize over the filter with a weight-stationary dataflow. This mapping runs 12.5 times faster than a RISC-V processor, reaching 0.80 MAC/cycle, and consumes 7.2 times less energy: 46 µJ compared with 334 µJ on the RISC-V processor.
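The two dataflows named in the abstract differ mainly in loop order, which the following minimal sketch tries to show on a 1D convolution. It is an illustration only, not the thesis's mapping or energy model: the function names and the access counters are hypothetical, and a single PE is assumed.

```python
def weight_stationary(x, w):
    """Outer loop over weights: each weight is fetched once and reused
    for every output, but partial sums are updated outside the register."""
    n_out = len(x) - len(w) + 1
    out = [0] * n_out
    weight_loads = 0
    psum_accesses = 0
    for k in range(len(w)):
        wk = w[k]                    # weight held "stationary" in the PE
        weight_loads += 1
        for i in range(n_out):
            out[i] += wk * x[i + k]  # partial sum read-modify-write
            psum_accesses += 1
    return out, weight_loads, psum_accesses


def output_stationary(x, w):
    """Outer loop over outputs: the partial sum stays in a local register
    until it is complete, but weights are re-fetched for every output."""
    n_out = len(x) - len(w) + 1
    out = []
    weight_loads = 0
    psum_accesses = 0
    for i in range(n_out):
        acc = 0                      # accumulator kept in the register file
        for k in range(len(w)):
            acc += w[k] * x[i + k]
            weight_loads += 1
        out.append(acc)              # one write-back per finished output
        psum_accesses += 1
    return out, weight_loads, psum_accesses


# Hypothetical toy data: 6 inputs, 3 weights, 4 outputs.
x = [1, 2, 3, 4, 5, 6]
w = [1, 0, -1]

ws_out, ws_weight_loads, ws_psum = weight_stationary(x, w)
os_out, os_weight_loads, os_psum = output_stationary(x, w)
```

Both orderings compute the same outputs. In this toy model the weight-stationary version fetches 3 weights and touches partial sums 12 times, while the output-stationary version fetches weights 12 times and writes 4 finished sums, which mirrors the trade-off described above.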

Supervisors: Daniele Jahier Pagliari, Maurizio Martina, Alessio Burrello, Davide Schiavone, Juan Sapriza
Academic year: 2023/24
Publication type: Electronic
Number of Pages: 82
Degree programme: Master's degree programme in Ingegneria Informatica (Computer Engineering)
Degree class: New organization > Master of Science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Joint supervision institution: Ecole Polytechnique Federale de Lausanne (EPFL) (SWITZERLAND)
URI: http://webthesis.biblio.polito.it/id/eprint/30827