Yuliang Chen
Efficient Mixed-Precision Quantization of Deep Neural Networks for Edge Applications.
Supervisor: Mario Roberto Casu. Politecnico di Torino, Master's degree in Ingegneria Elettronica (Electronic Engineering), 2024
PDF (Tesi_di_laurea), Thesis. License: Creative Commons Attribution Non-commercial No Derivatives. Download (7MB)
Abstract
This thesis explores the impact of mixed-precision quantization (MPQ) on deploying deep neural networks (DNNs) in edge applications. The research aims to reduce the computational complexity of inference on embedded devices by constraining scaling factors to powers of two, so that the multiplications of the rescaling step can be replaced by efficient shift operations. This approach reduces computational cost and energy consumption but can narrow the quantization range, potentially degrading model performance. The study involved training several models, including MobileNetV1, MobileNetV2, an auto-encoder, EfficientNet, ResNet, and a CNN for a keyword spotting (KWS) task. While all models performed well under MPQ, only the auto-encoder and the KWS CNN maintained good performance under flat quantization, where the same quantizer is applied across all layers.
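The power-of-two scaling idea described in the abstract can be sketched as follows. This is an illustrative example, not the thesis implementation; the function names and bit width are assumptions. The scale of a symmetric quantizer is rounded to the nearest power of two, so that dequantization becomes a multiplication by 2^exp, which integer hardware can realize as a bit shift:

```python
import numpy as np

# Illustrative sketch (not the thesis code): symmetric quantization with
# the scale rounded to a power of two, so rescaling needs only a shift.
def pow2_quantize(x, num_bits=8):
    """Quantize x to signed num_bits integers with a power-of-two scale."""
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for 8 bits
    scale = np.max(np.abs(x)) / qmax           # real-valued scale
    exp = int(np.round(np.log2(scale)))        # nearest power-of-two exponent
    q = np.clip(np.round(x / 2.0 ** exp), -qmax - 1, qmax).astype(np.int32)
    return q, exp

def pow2_dequantize(q, exp):
    # Multiplying by 2**exp is a left/right shift on integer hardware.
    return q.astype(np.float64) * 2.0 ** exp

x = np.array([0.8, -0.31, 0.05, 1.2])
q, exp = pow2_quantize(x)       # q = [102, -40, 6, 127], exp = -7
x_hat = pow2_dequantize(q, exp)
```

Note how rounding the scale down to a power of two narrows the representable range: the value 1.2 saturates at 127 × 2⁻⁷ ≈ 0.992, which illustrates the range-versus-efficiency trade-off mentioned in the abstract.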
Supervisors
Academic Year
Publication Type
Number of Pages
Degree Programme
Degree Class
URI
