Towards Autonomous Robotic Spray Painting with Unsupervised Reinforcement Learning

Marco Prattico'

Towards Autonomous Robotic Spray Painting with Unsupervised Reinforcement Learning.

Supervisors: Tatiana Tommasi, Raffaello Camoriano, Gabriele Tiboni. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2023

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Long-standing problems in robotics such as cleaning and spray painting require the generation of task-specific trajectories that satisfy physical constraints. Autonomously deriving robotic paths would accelerate this generation process and reduce the manual effort currently required in such settings. Furthermore, path generation needs to adapt to the specific geometries of the target objects. These stringent requirements are usually met by designing ad-hoc heuristics for each specific object category.

Reinforcement Learning (RL) has been successfully employed in the literature to tackle autonomous robotic tasks, from manipulation to locomotion. However, RL algorithms can suffer from poor generalization and low sample efficiency, i.e., a large number of agent-environment interactions is often needed for the algorithm to converge to successful, yet specific, policies. Unsupervised Reinforcement Learning (URL) has been proposed to speed up policy training by introducing a task-agnostic policy pre-training phase. In particular, pre-training does not involve task-specific reward signals. Instead, it exploits intrinsic motivation to encourage exploration of the underlying environment. This approach enables collecting transferable knowledge from the environment, which is later used to fine-tune the policy on a range of more specific downstream tasks.

Recent works successfully employ URL to drive exploration, e.g., by maximizing the entropy of the state visitation distribution. Nevertheless, such algorithms have only been investigated in locomotion tasks or gridworld environments, despite their potential for coverage path planning (CPP) problems, which may benefit from state-entropy maximization.

This thesis investigates the use of (U)RL for path-planning problems. In particular, we focus on robotic spray painting, a fundamental industrial manufacturing task that is strongly related to the class of CPP problems. In this context, we adopt the URL framework to speed up the training of an object-specific policy by pre-training a single policy with a state-entropy maximization objective. Our experimental results characterize the impact of intrinsic motivation on the training process, comparing the final learning outcomes when training from scratch with those obtained by starting from a pre-trained URL policy.
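
As an illustration of what a state-entropy maximization objective can look like in practice, the following is a minimal sketch of a particle-based intrinsic reward. It assumes a k-nearest-neighbor estimator, in which a state is rewarded more the farther it lies from the other visited states; the abstract does not say which estimator the thesis uses, and the function name `knn_intrinsic_reward` and the parameter `k` are hypothetical.

```python
import numpy as np


def knn_intrinsic_reward(states: np.ndarray, k: int = 3) -> np.ndarray:
    """Illustrative intrinsic reward: log(1 + distance to the k-th nearest neighbor).

    Assumption: this k-NN form is one common proxy for state entropy,
    not necessarily the estimator used in the thesis.
    """
    states = np.asarray(states, dtype=float)
    n = states.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of states")

    # Pairwise Euclidean distances between all visited states, shape (n, n).
    diffs = states[:, None, :] - states[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)

    # After sorting each row, column 0 is the zero distance of a state to
    # itself, so column k holds the distance to its k-th nearest neighbor.
    kth_dist = np.sort(dists, axis=1)[:, k]

    # States in sparsely visited regions get larger rewards.
    return np.log1p(kth_dist)


if __name__ == "__main__":
    # Four clustered states and one isolated state.
    visited = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]])
    print(knn_intrinsic_reward(visited, k=2))
```

In this toy batch the isolated state receives the largest reward, which is the kind of signal that pushes a pre-training policy towards regions it has not yet covered.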

Supervisors: Tatiana Tommasi, Raffaello Camoriano, Gabriele Tiboni
Academic year: 2023/24
Publication type: Electronic
Number of Pages: 95
Degree programme: Master's degree programme in Computer Engineering (Ingegneria Informatica)
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/29391