Explainable Reinforcement Learning for Risk Mitigation in a human-robot collaboration scenario

Alessandro Iucci

Supervisor: Paolo Garza. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2021

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Reinforcement Learning (RL) algorithms are highly popular in the robotics field because they can solve complex control problems, learn from dynamic environments, and generate optimal outcomes. Explainability for all Machine Learning (ML)-based algorithms, including RL, is gaining importance because the increasing complexity of the models makes them more accurate but at the same time less transparent. The need for explainability is even greater in Human-Robot Collaboration (HRC) scenarios, where safety must be guaranteed. This work applies two explainability techniques, “Reward Decomposition” and “Autonomous Policy Explanation”, to an RL algorithm that is the core of a risk mitigation module for robot operation in an automated warehouse, an HRC environment where humans and robots work together without harming one another.
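As a rough illustration of the idea behind Reward Decomposition, the sketch below splits a reward into named components and compares two actions component by component. It is a minimal sketch and not the implementation used in the thesis: the component names (`safety`, `task`), the 1.0 distance threshold, the action names, and the Q-values are all hypothetical and chosen only to keep the example small.

```python
from typing import Dict

# Hypothetical reward components for a warehouse robot (illustrative only).
def decomposed_reward(distance_to_human: float, progress: float) -> Dict[str, float]:
    """Return one reward term per aspect of the state instead of a single scalar."""
    return {
        # Assumed penalty when the robot is closer than 1.0 (arbitrary unit) to a human.
        "safety": -1.0 if distance_to_human < 1.0 else 0.0,
        # Assumed reward proportional to progress toward the task goal.
        "task": progress,
    }

def total_reward(components: Dict[str, float]) -> float:
    """The scalar reward seen by a standard RL agent is the sum of the components."""
    return sum(components.values())

def best_action(q_components: Dict[str, Dict[str, float]]) -> str:
    """Pick the action whose summed per-component Q-values is largest."""
    return max(q_components, key=lambda a: sum(q_components[a].values()))

def explain_choice(
    q_components: Dict[str, Dict[str, float]], chosen: str, alternative: str
) -> Dict[str, float]:
    """Per-component advantage of the chosen action over an alternative.

    Positive values are components that favored the chosen action;
    negative values are components that argued against it.
    """
    return {
        name: q_components[chosen][name] - q_components[alternative][name]
        for name in q_components[chosen]
    }

# Hypothetical per-component Q-values for two actions in one state.
q_values = {
    "slow_down": {"safety": 0.0, "task": 0.5},
    "keep_speed": {"safety": -1.0, "task": 1.0},
}

chosen = best_action(q_values)
explanation = explain_choice(q_values, chosen, "keep_speed")
print(chosen, explanation)
```

In this toy state the agent prefers `slow_down`, and the per-component differences show that the preference comes from the `safety` term outweighing a loss on the `task` term. Differences of this kind are what a bar chart or similar graphical explanation would display.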

The first technique used is “Reward Decomposition”, which gives insight into the factors that influenced the robot’s choice by decomposing the reward function into sub-functions, each considering a specific aspect of the robot’s state, and using a graphical type of explanation