Safe Exploration with Safety Layer and reward shaping
Alessia Basler
Safe Exploration with Safety Layer and reward shaping.
Supervisor: Manuela Battipede. Politecnico di Torino, Master's degree programme in Aerospace Engineering, 2021
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
The purpose of this Master's thesis is to investigate and improve one of the state-of-the-art Safe Reinforcement Learning algorithms. The algorithm under study applies a Safety Layer to classical Reinforcement Learning algorithms in order to achieve Safe Exploration during the learning phases, which would open the door to training intelligent agents in the real world. The Safety Layer algorithm performs well in environments where the danger is located at the edges, but its performance degrades in environments where hazards are spread through the space in a heterogeneous way. To improve performance in such situations, reward shaping has been introduced to reinforce the safety action of the Safety Layer.
The first chapters of the thesis present an introduction to Artificial Intelligence, Deep Neural Networks, and classic and Deep Reinforcement Learning.
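The abstract does not give the formulas used in the thesis, so the following is only a minimal sketch of how a safety layer and reward shaping might fit together. It assumes a single safety signal `c` whose next value is approximated linearly as `c + g·action`, and a hypothetical shaping penalty that grows as `c` approaches its limit. The function names, the linear model, and the `weight` and `margin` parameters are illustrative assumptions, not the author's implementation.

```python
import numpy as np


def safety_layer(action, g, c, limit):
    """Correct `action` so that the assumed linear constraint model holds.

    Assumption (not from the thesis): the next safety signal is
    approximated as c + g . action, and "safe" means it stays <= limit.
    The correction is the smallest change to `action` in the Euclidean sense.
    """
    action = np.asarray(action, dtype=float)
    g = np.asarray(g, dtype=float)
    violation = c + g @ action - limit
    denom = g @ g
    if violation <= 0.0 or denom == 0.0:
        # Already predicted safe, or the action has no modelled effect on c.
        return action
    lam = violation / denom
    return action - lam * g


def shaped_reward(reward, c, limit, weight=1.0, margin=0.2):
    """Subtract a penalty once c enters a margin below the limit.

    `weight` and `margin` are hypothetical tuning parameters.
    """
    closeness = max(0.0, c - (limit - margin))
    return reward - weight * closeness


# Usage: an unsafe proposal is pulled back onto the constraint boundary,
# and the reward is reduced when the agent is close to the limit.
safe_action = safety_layer([1.0, 0.0], g=[1.0, 0.0], c=0.5, limit=1.0)
r = shaped_reward(1.0, c=0.9, limit=1.0, weight=2.0)
```

In this sketch the safety layer only reacts once a violation is predicted, while the shaping term discourages the agent from approaching hazards in the first place, which is one plausible reading of how shaping could "reinforce" the layer when hazards are scattered through the space.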
