Towards Scalable and Energy-Efficient AI/ML Hardware Accelerators
Lorenzo Ruotolo
Supervisor: Daniele Jahier Pagliari. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2024
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
As transistors enter the sub-nanometer era, energy efficiency has become essential for high-performance hardware architectures, especially in high-end data-center accelerators running complex workloads such as Convolutional Neural Networks (CNNs) and other Deep Learning (DL) applications. Following this trend, new architectures are emerging. One of them is the Soft-SIMD functional unit, a SIMD unit in which sub-word parallelism is managed in software rather than fixed in hardware. This architecture supports the flexible use of low bit-width data types (as low as 3 bits), improving parallel performance on both uniformly and heterogeneously quantized (UQ and HQ) CNNs compared to hardware-based counterparts (hard-SIMD). The design also employs shift-add-based Canonical Signed Digit (CSD) multiplication, which further reduces area by 59.9% and energy consumption by 50.1% relative to hard-SIMD, with only minor performance degradation.
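To make the CSD idea concrete, the following is a minimal Python sketch (not the thesis's hardware implementation) of recoding a multiplier constant into canonical signed digits, where every digit is in {-1, 0, +1} and no two adjacent digits are nonzero, so a constant multiplication reduces to a few shifts plus adds/subtracts:

```python
def csd_digits(k):
    """Recode integer k into CSD digits in {-1, 0, +1}, LSB first.

    The choice d = 2 - (k mod 4) maps k % 4 == 1 to +1 and
    k % 4 == 3 to -1, which guarantees no two adjacent nonzero digits.
    """
    digits = []
    while k != 0:
        if k & 1:
            d = 2 - (k & 3)  # +1 or -1 depending on the two low bits
            k -= d           # remove the digit's contribution
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits

def csd_multiply(x, k):
    """Multiply x by constant k using only shifts and adds/subtracts."""
    acc = 0
    for shift, d in enumerate(csd_digits(k)):
        if d:
            acc += d * (x << shift)  # one shift-add (or shift-subtract) per nonzero digit
    return acc
```

For example, 7 = 111 in binary (three add terms) recodes to CSD as +1,0,0,-1 (i.e. 8 - 1, two terms), which is the source of the area and energy savings over a full multiplier array.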
Another architecture central to this work is the Very Wide Register (VWR), which mitigates the high energy cost of frequent and repetitive memory accesses in DL workloads by organizing registers as a very wide but extremely shallow (single-bit data line) single-ported memory array.
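The functional effect of a very wide register (one wide access serving many narrow operands) can be sketched in Python as packing and unpacking sub-word lanes; `LANE_BITS` and `NUM_LANES` here are illustrative assumptions, not parameters taken from the thesis:

```python
LANE_BITS = 3   # assumed narrow lane width, matching the 3-bit minimum mentioned above
NUM_LANES = 16  # hypothetical number of lanes in one very wide word

def pack(vals):
    """Pack narrow unsigned operands into one wide register word."""
    assert len(vals) <= NUM_LANES
    mask = (1 << LANE_BITS) - 1
    word = 0
    for i, v in enumerate(vals):
        word |= (v & mask) << (i * LANE_BITS)
    return word

def unpack(word, n):
    """Recover n narrow operands from a single wide read."""
    mask = (1 << LANE_BITS) - 1
    return [(word >> (i * LANE_BITS)) & mask for i in range(n)]
```

In this model a single `pack`/`unpack` round-trip stands in for one wide register access that amortizes its cost over all lanes, instead of one narrow memory access per operand.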