From Gated-SSA to Out-of-Order Dataflow Circuits

Giacomo Sansone

From Gated-SSA to Out-of-Order Dataflow Circuits.

Supervisor: Mariagrazia Graziano. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2025

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Over the past ten years, both companies and universities have pushed for working, reliable High-Level Synthesis (HLS) tools. Such tools are a prerequisite for the spread and success of FPGAs across many industries: people without a hardware background should be able to rely on them and deploy hardware starting from what they are more comfortable with, namely software. Two main paradigms have emerged. Static-HLS instantiates a datapath to perform the computation and a controller to steer the data and the execution flow. While this approach is intuitive and pragmatic, it lacks flexibility, because the schedule of operations is fixed at compile time. In dynamic-HLS [Josipović, FPGA’18], by contrast, control is distributed among the components, which exchange tokens using a valid/ready handshake. This mechanism lets each component start as soon as its operands are available, guaranteeing correctness by design.

This thesis is about Dynamatic, a dynamic-HLS tool designed and developed at EPFL since 2018. Many studies have been conducted using this compiler to optimise the area and timing of the resulting circuits, e.g. [Josipović, FPGA’20]. The first part of the thesis concerns implementation, with the goal of becoming familiar with the compiler infrastructure (the compiler is MLIR-based) and being ready to develop within it. Many published papers on this topic propose solutions that are not integrated into the compiler itself. To start addressing this issue, the papers [Elakhras, FPL’22] and [Elakhras, FPGA’23] were chosen for implementation. Several reasons justify this choice: they bring significant improvements in timing and area; the way they work opens the door to a new set of optimisations that cannot be done otherwise; and their requirements are non-trivial.
This work amounts to around 3000 lines of code, and the benchmarks show a ~30% improvement in execution time, along with a ~10% area reduction. Building on these circuit implementations, we address the problem of task-level parallelism. As the circuits are designed, only one set of inputs can be processed at a time. This restriction is necessary to maintain correctness, but it is sub-optimal in terms of buffer occupancy and in the presence of long-latency operations. It matters even more in a streaming environment, such as an FPGA used in a server. A common solution to this problem is out-of-order execution. On its own, however, it does not preserve the ordering of results and therefore does not guarantee correctness. To address the problem, the out-of-order infrastructure from [Elakhras, FPGA’24] can be introduced. Since this infrastructure is an overhead in terms of area, we aim to limit its adoption.

Our solution is to develop a mathematical framework within Dynamatic that statically analyses circuits and determines when out-of-order execution is beneficial. This framework takes into account the behaviour of all the different dataflow components, as well as the topology of the resulting circuits. Overall, it aims to model the throughput (the number of tokens per unit of time) of every channel in the steady state of the circuit. Using the compiler itself as a validation tool, we find that the current model has a ~5% error with respect to what we measure in simulation.
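
As an illustration of the two notions the abstract relies on, the valid/ready handshake and the steady-state throughput of a channel, the following is a minimal Python sketch. It is a toy cycle-based model and not code from Dynamatic or from the thesis: the `Stage` class, the `simulate` and `steady_state_throughput` functions, and the assumption that each component holds a single token for a fixed number of cycles are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Stage:
    """Hypothetical dataflow component holding at most one token."""
    latency: int                  # cycles a token is busy before being offered (>= 1)
    token: Optional[int] = None   # token currently held, None when empty
    remaining: int = 0            # cycles of work left on the held token

    @property
    def valid(self) -> bool:
        # The output channel carries a token once the work is finished.
        return self.token is not None and self.remaining == 0

    @property
    def ready(self) -> bool:
        # The input channel can accept a token only when the slot is empty.
        return self.token is None


def simulate(latencies: List[int], n_tokens: int,
             max_cycles: int = 10_000) -> List[Tuple[int, int]]:
    """Push n_tokens through a linear chain; return (cycle, token) seen at the sink."""
    stages = [Stage(lat) for lat in latencies]
    received: List[Tuple[int, int]] = []
    next_token = 0
    for cycle in range(max_cycles):
        # Walk from sink to source so a slot freed in this cycle can be
        # refilled in the same cycle. The sink is assumed always ready.
        last = stages[-1]
        if last.valid:
            received.append((cycle, last.token))
            last.token = None
        for i in range(len(stages) - 2, -1, -1):
            src, dst = stages[i], stages[i + 1]
            if src.valid and dst.ready:          # handshake: transfer the token
                dst.token, dst.remaining = src.token, dst.latency
                src.token = None
        if next_token < n_tokens and stages[0].ready:
            stages[0].token, stages[0].remaining = next_token, stages[0].latency
            next_token += 1
        for s in stages:                         # one cycle of work on busy stages
            if s.token is not None and s.remaining > 0:
                s.remaining -= 1
        if len(received) == n_tokens:
            break
    return received


def steady_state_throughput(received: List[Tuple[int, int]]) -> float:
    """Tokens per cycle at the sink, measured between first and last arrival."""
    first_cycle, last_cycle = received[0][0], received[-1][0]
    return (len(received) - 1) / (last_cycle - first_cycle)


if __name__ == "__main__":
    slow = simulate([1, 3, 1], n_tokens=20)
    fast = simulate([1, 1, 1], n_tokens=20)
    print("throughput with a 3-cycle stage:", steady_state_throughput(slow))
    print("throughput with 1-cycle stages: ", steady_state_throughput(fast))
```

In this toy chain the slowest stage bounds the throughput of every channel, and tokens always leave in the order they entered. A real dataflow circuit also contains forks, joins, branches and loops, which is why the thesis models component behaviour and circuit topology together rather than a single chain.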

Supervisors: Mariagrazia Graziano
Academic year: 2024/25
Publication type: Electronic
Number of pages: 113
Subjects:
Degree programme: Master's degree programme in Computer Engineering (Ingegneria Informatica)
Degree class: New system > Master's degree > LM-32 - COMPUTER ENGINEERING (INGEGNERIA INFORMATICA)
Joint supervision institution: KUNGLIGA TEKNISKA HOGSKOLAN (ROYAL INSTITUTE OF TECHNOLOGY) - EECS (SWEDEN)
Collaborating organisations: EPFL - ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE
URI: http://webthesis.biblio.polito.it/id/eprint/35508