Design and Evaluation of Reconfigurable Systolic Arrays for Neural Networks

Sergio Ivano Fierro


Supervisors: Mario Roberto Casu, Edward Manca. Politecnico di Torino, Master's degree programme in Electronic Engineering (Corso di laurea magistrale in Ingegneria Elettronica), 2025

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

In recent years, Deep Neural Networks (DNNs) have achieved unprecedented accuracy across a wide range of tasks. These gains, however, come with substantial increases in model complexity and per-inference computational cost. As a result, deploying DNNs presents new challenges, and devising efficient methods to execute their computations has become a central research concern. Modern DNN workloads consist of many nested loops dominated by Multiply-and-Accumulate (MAC) operations. In this context, Systolic Arrays (SAs) have emerged as an architecture that connects and coordinates large numbers of Processing Elements (PEs) operating in parallel. SAs are usually composed of PEs that communicate only with their direct neighbors, and this neighbor-to-neighbor connectivity keeps the fan-out of their connections low. Moreover, SAs support dataflow organizations that enable operand reuse and relax the pressure on the memory that must feed the PEs with new values. Finally, their regular structure pairs well with lightweight control units, usually composed of counters. All these properties make SAs a good choice for computing DNN kernels such as matrix multiplication and convolution. On the other hand, both the shape of the array (i.e., the number of elements along rows and columns) and the dataflow scheme the array is designed for, namely Output Stationary (OS), Weight Stationary (WS), or Input Stationary (IS), have a significant impact on computational efficiency. Each combination of shape and dataflow determines a computation strategy that fits some algorithms better than others and corresponds to a different design point in terms of latency, silicon area, and power consumption. The goal of this thesis is to explore and design reconfigurable SAs that support multiple shapes and/or multiple dataflows. To this end, I explored the design space of SAs with different shapes and dataflows over a selected class of DNN kernels, i.e., convolutional, linear, and attention layers.
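To make the Output Stationary dataflow mentioned above concrete, the following is a minimal cycle-level sketch in Python. It is not the thesis design: the function name `os_systolic_matmul`, the one-PE-per-output mapping, and the input skewing scheme are assumptions chosen for illustration. Each PE keeps its own accumulator while operands of A move right and operands of B move down, one neighbor per cycle.

```python
def os_systolic_matmul(a, b):
    """Simulate C = A x B on an M x N Output Stationary array (illustrative).

    Assumptions (not taken from the thesis): one PE per output element,
    row i of A enters from the left delayed by i cycles, column j of B
    enters from the top delayed by j cycles, and draining the results
    out of the array is not counted.
    """
    m, k_dim, n = len(a), len(a[0]), len(b[0])
    acc = [[0] * n for _ in range(m)]    # stationary partial sums
    a_reg = [[0] * n for _ in range(m)]  # A operands travelling right
    b_reg = [[0] * n for _ in range(m)]  # B operands travelling down
    cycles = k_dim + m + n - 2           # last MAC happens in PE(m-1, n-1)

    for t in range(cycles):
        # Visit PEs from the far corner so each PE still sees the value
        # its left/top neighbor held during the previous cycle.
        for i in range(m - 1, -1, -1):
            for j in range(n - 1, -1, -1):
                if j == 0:
                    k = t - i            # skewed injection of row i of A
                    a_in = a[i][k] if 0 <= k < k_dim else 0
                else:
                    a_in = a_reg[i][j - 1]
                if i == 0:
                    k = t - j            # skewed injection of column j of B
                    b_in = b[k][j] if 0 <= k < k_dim else 0
                else:
                    b_in = b_reg[i - 1][j]
                a_reg[i][j] = a_in
                b_reg[i][j] = b_in
                acc[i][j] += a_in * b_in  # the MAC of this PE
    return acc, cycles


result, cycles = os_systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

In this sketch every operand is read from memory once and then reused by a whole row or column of PEs, which is the operand reuse the paragraph refers to. A Weight Stationary or Input Stationary variant would instead pin one operand in the PEs and move the partial sums.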
For each configuration, I verified functional correctness with RTL simulation tools. Moreover, I collected the number of clock cycles the SA needs to complete each computation, and I synthesized the designs on a 28 nm digital library to obtain latency, silicon area, and power consumption results. Overall, these data allowed me to rank the various shapes and dataflows by an efficiency metric, throughput/W, and to select the most efficient SA configurations. Once the best configurations had been selected, I designed and verified a reconfigurable SA that supports more than one configuration in the same design. Since the number of supported configurations directly influences the overhead introduced by reconfigurability, I explored architectures that implement at most two or three configurations in the same SA. This design was simulated and synthesized on the same 28 nm technology library to validate its efficiency with the same throughput/W metric. The study demonstrates the importance of architectural choices in the SA design process and proposes a path toward more efficient reconfigurable SAs that can optimally execute more than one DNN algorithm with the same SA structure. Looking forward, this approach may serve as a foundation for studying and efficiently computing the algorithms of novel DNN layers by leveraging run-time reconfiguration.
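The ranking step described above can be sketched as follows. This is only an illustration: the abstract does not define throughput precisely, so the sketch assumes MAC operations per second, and the `SaResult` type, the configuration names, and every number below are hypothetical placeholders rather than results from the thesis.

```python
from dataclasses import dataclass


@dataclass
class SaResult:
    """Measurements for one SA configuration (hypothetical container)."""
    name: str
    mac_count: int    # MAC operations in the kernel
    cycles: int       # clock cycles reported by RTL simulation
    clock_hz: float   # clock frequency reached after synthesis
    power_w: float    # power estimated after synthesis


def throughput_per_watt(r):
    """Assumed metric: (MAC operations per second) divided by watts."""
    latency_s = r.cycles / r.clock_hz
    return r.mac_count / latency_s / r.power_w


def rank_by_efficiency(results):
    """Sort configurations from most to least efficient."""
    return sorted(results, key=throughput_per_watt, reverse=True)


# Placeholder values only, chosen to exercise the code.
results = [
    SaResult("cfg_A", 1_000_000, 5_000, 500e6, 0.20),
    SaResult("cfg_B", 1_000_000, 6_000, 500e6, 0.15),
    SaResult("cfg_C", 1_000_000, 8_000, 500e6, 0.18),
]
ranking = [r.name for r in rank_by_efficiency(results)]
```

With these placeholder values, `cfg_B` ranks first even though it needs more cycles than `cfg_A`, because its lower power more than compensates. This is why a combined metric such as throughput/W can order configurations differently from latency alone.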

Supervisors: Mario Roberto Casu, Edward Manca
Academic year: 2025/26
Publication type: Electronic
Number of pages: 80
Subjects:
Degree programme: Master's degree programme in Electronic Engineering (Corso di laurea magistrale in Ingegneria Elettronica)
Degree class: New regulations > Master's degree > LM-29 - Electronic Engineering
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/38733