Semantic Scene Segmentation for Indoor Robot Navigation

Daniele Cotrufo

Semantic Scene Segmentation for Indoor Robot Navigation.

Supervisor: Marcello Chiaberge. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2022

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Scene segmentation is an important component for robots that must navigate in indoor environments. Obstacle avoidance, the task of detecting and avoiding obstacles, is an active research topic for autonomous robots. Collision-free motion requires a robust obstacle detection module. The objective of this thesis is to enable a robot to navigate autonomously, relying only on visual perception, by performing real-time segmentation of the indoor scene. In accordance with the state of the art, the proposed method is based on a Deep Learning model for Semantic Scene Segmentation. A Pyramid Scene Parsing Network (PSPNet) with a ResNet-34 backbone is chosen as the model to train. The backbone is first pre-trained on the ImageNet dataset; then, keeping these weights fixed, the PSPNet is trained on the labeled dataset for semantic segmentation. To detect the traversable part of the scene with high robustness, binary segmentation is chosen, so pixels are labeled as floor (1) or not floor (0). This approach achieves a 91% Intersection over Union (IoU) score on the test set. Once the image is segmented, a post-processing step extracts a "pixel-goal" for navigation. Navigation is performed by a proportional controller that links the steering angle to the coordinates of the pixel-goal; the navigation algorithm also sets the linear velocity. A ROS2 network with a segmentation node and a navigation node is built. The model and the ROS2 network are then deployed on a Jetson AGX Xavier platform, and the pipeline is tested on a Turtlebot3 robot. The proportional gain is tuned directly through real-world trials, and multiple tests are performed to analyze performance, mainly from a qualitative point of view. The best results are achieved in corridor scenarios, where the robot avoids obstacles along its path while staying far enough from the walls. In situations with multiple objects of more complex shape, such as people and chairs in an office, performance is worse, but the robot still often performs obstacle avoidance correctly.
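
The two quantities the abstract relies on, the IoU of the binary floor mask and a proportional steering law driven by the pixel-goal, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the normalization of the error to [-1, 1], the sign convention (positive command means turn left), and the gain `k_p` are all assumptions made for illustration, and the abstract does not specify how the pixel-goal is extracted.

```python
from typing import Sequence


def binary_iou(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Intersection over Union of the floor class (label 1) for two
    equally sized binary masks given as nested sequences of 0/1."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == 1 and t == 1:
                intersection += 1
            if p == 1 or t == 1:
                union += 1
    # Assumption: two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


def steering_from_pixel_goal(goal_x: float, image_width: int, k_p: float) -> float:
    """Proportional steering command from the pixel-goal column.

    Assumed convention (hypothetical, not from the thesis): the error is
    the horizontal offset of the goal from the image center, normalized
    to [-1, 1]; a goal left of center yields a positive command.
    """
    half_width = image_width / 2.0
    error = (half_width - goal_x) / half_width
    return k_p * error


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    ground_truth = [[1, 0, 0],
                    [0, 1, 1]]
    print("IoU:", binary_iou(predicted, ground_truth))
    print("steering:", steering_from_pixel_goal(goal_x=160, image_width=640, k_p=1.0))
```

In a ROS2 setting, a value like the one returned by `steering_from_pixel_goal` would typically be written to the angular component of a velocity command, with `k_p` tuned on the real robot as the abstract describes.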

Supervisors: Marcello Chiaberge
Academic year: 2021/22
Publication type: Electronic
Number of pages: 88
Subjects:
Degree programme: Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica)
Degree class: New regulations > Master's degree > LM-25 - Automation Engineering
Collaborating companies: Politecnico di Torino - PIC4SER
URI: http://webthesis.biblio.polito.it/id/eprint/22861