Riccardo Bosio
Neural Networks hardware-specific optimization using different frameworks.
Supervisor: Paolo Garza. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2022
Abstract
The goal of this thesis is to implement different optimization pipelines in order to speed up Neural Network inference time on specific hardware. To reach this goal I tested several frameworks, comparing the inference times obtained by applying the pipelines to three AllRead Machine Learning Technologies models, each solving a different Computer Vision task. The explored frameworks are Apache TVM, OpenVINO, TensorRT and DeepStream, and the target hardware platforms are Intel i5 and i7 CPUs and the NVIDIA Jetson Xavier GPU. Apache TVM is an open-source end-to-end machine learning compiler framework for CPUs and GPUs that enables optimization on any hardware backend.
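Since the comparison above rests on measuring inference time before and after optimization, a minimal latency-benchmark harness clarifies what is being compared. This is a hedged sketch, not the thesis code: the function names (`mean_latency_ms`, `speedup`) and the warmup/run counts are illustrative assumptions, and any real measurement would call the compiled model's inference function in place of the dummy callable.

```python
import time

def mean_latency_ms(fn, runs=100, warmup=10):
    """Average wall-clock latency of fn() in milliseconds.

    Warmup iterations are discarded so that one-time costs
    (lazy initialization, caching, JIT compilation) do not
    skew the measured inference time.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs * 1e3

def speedup(baseline_ms, optimized_ms):
    """How many times faster the optimized pipeline runs."""
    return baseline_ms / optimized_ms
```

For example, if a baseline model averages 10 ms per inference and the TVM-compiled one averages 2 ms, `speedup(10.0, 2.0)` reports a 5x improvement.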
OpenVINO is a toolkit designed for optimizing and delivering AI inference on Intel products; the exploited optimization techniques include accuracy-aware quantization.
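To make the quantization idea concrete, the following is a minimal, dependency-free sketch of symmetric per-tensor int8 quantization, the arithmetic underlying techniques like those in OpenVINO. It is an illustrative assumption, not OpenVINO's implementation: accuracy-aware quantization additionally searches for a configuration that keeps the accuracy drop within a tolerance, which is omitted here.

```python
def quantize_int8(values):
    """Symmetric per-tensor int8 quantization: x ~ q * scale.

    The scale maps the largest absolute value onto 127,
    so the full float range fits into the int8 range.
    """
    scale = max(abs(v) for v in values) / 127.0 or 1.0  # guard all-zero input
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [v * scale for v in q]
```

Running int8 arithmetic instead of float32 is what yields the inference speedups on CPU targets, at the cost of the small rounding error visible when dequantizing.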
