
Optimizing YOLO Inference for Hardware Constraints Through Quantization Techniques

Niccolo Cacioli

Optimizing YOLO Inference for Hardware Constraints Through Quantization Techniques.

Supervisors: Luciano Lavagno, Teodoro Urso. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2025

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Download (11MB)
Abstract:

This thesis investigates the application of YOLO (You Only Look Once) models to object detection, with a particular focus on quantizing these models for efficient deployment on edge devices and other resource-constrained hardware platforms. Model quantization plays a critical role in reducing memory footprint and computational cost while aiming to preserve the accuracy and robustness of the original floating-point networks. The work builds a complete training and evaluation pipeline, including data pre-processing compliant with widely adopted standards (e.g., YOLO) and automated tools for ground truth visualization and validation. Various training strategies were explored to enhance model performance, including hyperparameter tuning, architectural modifications, and data augmentation techniques. A central contribution of the work is the design of a modular quantization workflow that leverages ONNX-compatible tools and is tailored for deployment on hardware-accelerated inference platforms. The methodology covers model export, transformation, optimization, and performance validation of the quantized networks. Experimental results, obtained on both standard benchmarks and domain-specific datasets, show that the proposed approach achieves a favorable trade-off between model compactness and detection accuracy. These findings support the feasibility of adopting quantized YOLO architectures in real-world, real-time applications across diverse environments. The proposed workflow, together with the results obtained, offers a solid starting point for future improvements and optimizations.
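
To make the memory-versus-accuracy trade-off mentioned in the abstract concrete, the following is a minimal, generic sketch of affine 8-bit quantization in plain Python. It is not the workflow developed in the thesis, which relies on ONNX-compatible tooling; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float, int]:
    """Affine (asymmetric) quantization of floats to signed 8-bit integers.

    Illustrative only: real toolchains calibrate ranges per tensor or per
    channel and handle many more details.
    """
    qmin, qmax = -128, 127
    # Extend the range to include 0.0 so that zero maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 when every value is zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = max(qmin, min(qmax, round(qmin - lo / scale)))
    # Scale, shift by the zero point, and clamp to the int8 range.
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Map int8 codes back to approximate floating-point values."""
    return [(x - zero_point) * scale for x in q]


# Hypothetical weights, standing in for one small tensor of a network.
weights = [-1.0, -0.25, 0.0, 0.5, 1.5]
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 codes:", q)
print("scale:", scale, "zero point:", zero_point)
print("max round-trip error:", max_error)
```

Each value stored as an 8-bit integer occupies one byte instead of the four bytes of a 32-bit float, and the round-trip error of any in-range value is bounded by about half the scale. The detection accuracy of a full quantized network depends on many more factors and has to be measured experimentally, as the thesis does.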

Supervisors: Luciano Lavagno, Teodoro Urso
Academic year: 2024/25
Publication type: Electronic
Number of pages: 93
Subjects:
Degree programme: Master's degree programme in Computer Engineering (Ingegneria Informatica)
Degree class: New organization > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies: NOT SPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/36358