A Flexible FPGA-based Neural Processing Unit Architecture
Arianna Palermo
Rel. Mario Roberto Casu, Diana Göhringer. Politecnico di Torino, Master of science program in Electronic Engineering, 2024
PDF (Tesi_di_laurea) - Thesis
Licence: Creative Commons Attribution Non-commercial No Derivatives. Download (5MB)
Abstract
Deep Neural Networks (DNNs) exhibit varying performance requirements and energy constraints depending on the applications in which they are employed. Neural Processing Units (NPUs) are designed and optimized to execute network functions and applications efficiently. Today, the same application might need to operate with different data bit-widths to meet varying task requirements. A fixed bit-width accelerator would have limited advantages in accommodating these diverse bit-width needs. Therefore, it is beneficial to develop NPUs that, with the same internal structure, can support different types of quantized data. Recently, various precision-scalable MAC (Multiply-Accumulate) architectures optimized for neural networks have been proposed. This work presents a review of state-of-the-art precision-scalable MAC architectures and proposes a new solution that can function both as a MAC and as a standard parallel multiplier or adder.
For each operation, it supports multiple precisions: 16x16, 8x8, and 4x4 bits.
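As a rough illustration of the idea behind such precision-scalable designs (not the architecture proposed in the thesis), a 16x16 multiplication can be decomposed into shift-and-add combinations of 4x4 partial products, so the same array of 4-bit multipliers can serve the 16x16, 8x8, and 4x4 modes. A minimal Python sketch, with hypothetical function names:

```python
def split_nibbles(x, n_nibbles):
    """Split an unsigned integer into 4-bit limbs, least-significant first."""
    return [(x >> (4 * i)) & 0xF for i in range(n_nibbles)]

def scalable_mul(a, b, width):
    """Multiply two unsigned `width`-bit operands using only 4x4
    partial products, mimicking a precision-scalable multiplier array.

    `width` must be a multiple of 4 (e.g. 4, 8, or 16).
    """
    n = width // 4
    a_limbs = split_nibbles(a, n)
    b_limbs = split_nibbles(b, n)
    acc = 0
    for i, ai in enumerate(a_limbs):
        for j, bj in enumerate(b_limbs):
            # Each term is a 4x4 product, shifted to its weight position
            acc += (ai * bj) << (4 * (i + j))
    return acc
```

In 4-bit mode a single partial product suffices, in 8-bit mode four are combined, and in 16-bit mode sixteen; a hardware implementation would reuse the same multiplier cells and reconfigure only the shift-and-add network, which is the flexibility the abstract describes.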