
Assessing Feasibility and Performance of Real-Time Semantic Segmentation in an Industrial IoT use case

Giacomo Zema


Advisor: Andrea Calimera. Politecnico di Torino, Corso di laurea magistrale in Data Science And Engineering, 2022

PDF (Tesi_di_laurea) - Thesis, 19 MB. License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract:

Semantic Segmentation is a computer vision task that consists of assigning a label to every pixel of an input image. It has many applications, such as scene understanding for autonomous driving and robot vision, land cover classification of satellite images, and segmentation of medical images. Semantic Segmentation models are often very complex and require powerful hardware, which clashes with their use in resource-constrained environments such as the edge devices of an IoT network. For this reason, the standard approach when deploying these models in an IoT application is to offload both training and inference to a remote server. While training on a GPU-equipped server is a sensible choice, offloading the inference phase can be rather inefficient: transferring data from the edge to the server and back is expensive in terms of energy and raises critical issues of scalability, inference time and data security. Many edge applications of semantic segmentation collect image data from cameras and must process it immediately, so the inference phase has real-time requirements that cannot be met by offloading it to a cloud server. For such applications, a clear solution is to perform the inference on the edge devices themselves.

There are many real-time semantic segmentation models in the literature, but virtually none of them achieve real-time latency (<33 ms) on less powerful hardware. The reason is that these models are evaluated on very complex datasets, which favor bigger and more elaborate networks. In this thesis, we explore several optimizations that can be applied to a network to improve its inference latency while keeping an acceptable IoU score. The goal is to define an optimization pipeline that can be used to adapt the performance of a model to the requirements (IoU score and latency) of a specific task. We structured this pipeline according to the effort required by each step, where by effort we mean both the time required by the operation and the difficulty of its implementation. The proposed optimizations are grouped into two main branches: Input Data and Model Topology. The first consists in changing the size of the input images to speed up inference; it needs no knowledge of the model and therefore requires the least effort. The second branch demands a deeper understanding of the underlying network in order to identify important tuning knobs related to its different layers and functional blocks.

In the experimental part, we selected a specific industrial IoT use case and a baseline network from among the models reviewed in the background chapter. We ran the experiments on a Raspberry Pi 4 and studied in detail every optimization approach and their combinations, gaining meaningful insights that allowed us to propose an optimization framework that accounts for different levels of effort.
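To make the Input Data branch and the real-time budget concrete, the minimal sketch below measures per-frame inference latency of a lightweight segmentation network at a few input resolutions in PyTorch. The LR-ASPP MobileNetV3 model, the 19-class output, and the resolution list are illustrative assumptions and are not the baseline network or dataset studied in the thesis; only the 33 ms real-time budget comes from the abstract.

```python
# Hypothetical latency sweep over input resolutions (not the thesis's actual setup).
import time
import torch
from torchvision.models.segmentation import lraspp_mobilenet_v3_large

REAL_TIME_BUDGET_MS = 33.0                            # ~30 FPS target from the abstract
RESOLUTIONS = [(512, 1024), (256, 512), (128, 256)]   # assumed sweep, largest to smallest

# Stand-in lightweight model; 19 classes is an assumption (Cityscapes-like label set).
model = lraspp_mobilenet_v3_large(num_classes=19).eval()

with torch.inference_mode():
    for h, w in RESOLUTIONS:
        x = torch.randn(1, 3, h, w)                   # dummy frame at this resolution
        for _ in range(3):                            # warm-up runs, not timed
            model(x)
        runs = 10
        start = time.perf_counter()
        for _ in range(runs):
            model(x)["out"]                           # per-pixel logits
        ms = (time.perf_counter() - start) / runs * 1000
        verdict = "meets" if ms <= REAL_TIME_BUDGET_MS else "misses"
        print(f"{h}x{w}: {ms:.1f} ms/frame ({verdict} the 33 ms budget)")
```

Run on the target edge device (e.g., a Raspberry Pi 4 CPU), a sweep like this shows how shrinking the input trades IoU for latency without touching the model itself, which is why the thesis places it at the lowest-effort end of the pipeline.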

Advisors: Andrea Calimera
Academic year: 2021/22
Publication type: Electronic
Number of pages: 102
Subjects:
Degree programme: Corso di laurea magistrale in Data Science And Engineering
Degree class: New regulations > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies: NOT SPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/23584