Luca Mannini
Distributed CNN Inference on FPGA-based DPU Clusters.
Supervisors: Corrado De Sio, Luca Sterpone, Federico Buccellato. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2026
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Deep neural networks are increasingly deployed on edge platforms, but model growth often exceeds the compute and memory capacity of a single embedded accelerator. This thesis investigates model-parallel distributed inference across a cluster of AMD Xilinx Kria KV260 boards, each equipped with a Vitis AI DPU. The work focuses on a key practical challenge: when a quantized model is split into standalone components, the Vitis AI compiler may reorganize boundary operations, creating shape mismatches and offloading critical convolutions to the CPU. To address this issue, we design and implement an XIR graph splitter with an output fix operator that preserves quantization consistency at expert boundaries and eliminates the boundary penalty.
We also provide an empirical characterization of compiler behavior through micro-model experiments, deriving safe split-point rules for DPU-friendly partitioning.
