
Niccolo Cacioli
Optimizing YOLO Inference for Hardware Constraints Through Quantization Techniques.
Supervisors: Luciano Lavagno, Teodoro Urso. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2025
PDF (Tesi_di_laurea)
- Thesis
License: Creative Commons Attribution Non-commercial No Derivatives. Download (11MB) |
Abstract: |
This thesis investigates the application of YOLO (You Only Look Once) models to object detection tasks, with a particular focus on quantizing such models to enable efficient deployment on edge devices and resource-constrained hardware platforms. Model quantization plays a critical role in reducing memory footprint and computational cost while aiming to preserve the accuracy and robustness of the original floating-point networks. The work builds a complete training and evaluation pipeline, including data pre-processing compliant with widely adopted standards (e.g., YOLO) and automated tools for ground-truth visualization and validation. Various training strategies were explored to enhance model performance, including hyperparameter tuning, architectural modifications, and data augmentation techniques. A central contribution of the work is the design of a modular quantization workflow that leverages ONNX-compatible tools and is tailored for deployment on hardware-accelerated inference platforms. The methodology includes model export, transformation, optimization, and performance validation of the quantized networks. Experimental results, obtained from both standard benchmarks and domain-specific datasets, demonstrate that the proposed approach achieves a favorable trade-off between model compactness and detection accuracy. These findings support the feasibility of adopting quantized YOLO architectures in real-world, real-time applications across diverse environments. The proposed workflow, together with the results obtained, offers a solid starting point for future improvements and optimizations. |
---|---|
Supervisors: | Luciano Lavagno, Teodoro Urso |
Academic year: | 2024/25 |
Publication type: | Electronic |
Number of pages: | 93 |
Subjects: | |
Degree programme: | Master's degree programme in Computer Engineering (Ingegneria Informatica) |
Degree class: | New system > Master's degree > LM-32 - COMPUTER ENGINEERING |
Collaborating companies: | Not specified |
URI: | http://webthesis.biblio.polito.it/id/eprint/36358 |
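
As a rough illustration of why quantization reduces memory footprint, the sketch below applies a generic 8-bit affine quantization to a set of synthetic weights. It is not the thesis workflow, which relies on ONNX-compatible tools; the function names `quantize_int8` and `dequantize`, the random weights, and the choice of an asymmetric per-tensor scheme are all assumptions made for this example.

```python
import random
from array import array

def quantize_int8(values):
    """Map floats to int8 with a per-tensor scale and zero point (illustrative)."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # 255 steps span the int8 range [-128, 127]; fall back to 1.0 for all-zero input.
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = round(-128 - lo / scale)
    q = array(
        "b",
        (max(-128, min(127, round(v / scale) + zero_point)) for v in values),
    )
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 codes."""
    return [(qi - zero_point) * scale for qi in q]

# Synthetic float32 "weights" (hypothetical data, not taken from the thesis).
random.seed(0)
weights = array("f", (random.gauss(0.0, 0.1) for _ in range(4096)))

q, scale, zp = quantize_int8(weights)
restored = dequantize(q, scale, zp)

# Storage cost of each representation, and the worst-case reconstruction error.
fp32_bytes = len(weights) * weights.itemsize
int8_bytes = len(q) * q.itemsize
max_err = max(abs(w - r) for w, r in zip(weights, restored))

print(f"float32: {fp32_bytes} bytes, int8: {int8_bytes} bytes")
print(f"scale={scale:.6f}, zero_point={zp}, max abs error={max_err:.6f}")
```

Under these assumptions the int8 codes take a quarter of the float32 storage, and the reconstruction error stays within about one quantization step. Real deployments typically add calibration data and per-channel scales, which this sketch omits.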