Hybrid Deep Reinforcement Learning-based Collision Avoidance Algorithm for a Ground Robot in Indoor Environments

Francesco Melis

Supervisors: Elisa Capello, Hyeongjun Park. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2021

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:	The rapid development of Artificial Intelligence (AI) is revolutionizing an increasing number of fields and industries. The exploration of another planet by autonomous rovers could be considered the most fascinating application of such technologies. These robots need efficient and robust autonomous guidance and navigation techniques to make decisions and avoid obstacles in challenging and partially unknown environments. Machine Learning is a branch of AI, and within it Deep Reinforcement Learning is one of the most recent and promising techniques for addressing this challenge. It combines the Reinforcement Learning framework, in which an agent learns a policy that maps states to actions by interacting with an environment and receiving a numerical reward that depends on its behaviour, with the approximation capability of Deep Neural Networks. Inspired by these considerations, this thesis focuses on the development of a collision avoidance algorithm based on Deep Reinforcement Learning, applied to a ground robot equipped with a depth camera. A Neural Network that represents the policy of the agent has been trained to map a set of inputs, obtained from simulated odometry and camera data, to the linear and angular velocities of the robot. The training has been performed using Proximal Policy Optimization (PPO), a state-of-the-art policy learning algorithm, together with a multi-stage approach that trains the agent in suitably designed simulated scenarios of increasing complexity. To improve the performance of the control algorithm, a hybrid control framework has been implemented to switch between the PPO stochastic policy and a deterministic policy able to find a time-efficient path to the target in the absence of obstacles. The deterministic policy has been obtained by training a Neural Network with the Deep Deterministic Policy Gradient (DDPG) algorithm, and the switching process is based on the robot's sensor measurements of the environment. The algorithm has been tested in MATLAB simulations in several different and challenging scenarios, showing good performance in avoiding static obstacles. Finally, the algorithm has been verified in the Gazebo simulator, showing acceptable performance in complex environments within a more realistic framework.
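The hybrid switching idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the abstract only states that switching depends on the robot's sensor measurements, so the distance-threshold rule, the `clearance` parameter, and the names `obstacle_detected` and `hybrid_action` are all assumptions made for illustration. The two policies are passed in as plain callables standing in for the trained PPO and DDPG networks.

```python
from typing import Callable, Sequence, Tuple

# An action is a (linear velocity, angular velocity) pair, as in the abstract.
Action = Tuple[float, float]
Policy = Callable[[Sequence[float]], Action]


def obstacle_detected(depth_readings: Sequence[float], clearance: float) -> bool:
    """Return True if any depth reading is closer than `clearance` (hypothetical rule)."""
    return any(d < clearance for d in depth_readings)


def hybrid_action(
    state: Sequence[float],
    depth_readings: Sequence[float],
    avoidance_policy: Policy,   # stands in for the PPO-trained policy
    goal_policy: Policy,        # stands in for the DDPG-trained policy
    clearance: float = 1.0,     # assumed threshold in metres, not from the thesis
) -> Action:
    """Use the avoidance policy near obstacles, the goal-seeking policy otherwise."""
    if obstacle_detected(depth_readings, clearance):
        return avoidance_policy(state)
    return goal_policy(state)
```

In this sketch the caller would evaluate `hybrid_action` once per control step with the latest depth readings, so the active policy can change from one step to the next as obstacles enter or leave the assumed clearance range.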
Supervisors:	Elisa Capello, Hyeongjun Park
Academic year:	2020/21
Publication type:	Electronic
Number of pages:	67
Subjects:
Degree programme:	Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica)
Degree class:	New regulations > Master's degree > LM-25 - Automation Engineering
Collaborating companies:	Not specified
URI:	http://webthesis.biblio.polito.it/id/eprint/18029
