Safe Exploration with Safety Layer and reward shaping
Alessia Basler
Safe Exploration with Safety Layer and reward shaping.
Supervisor: Manuela Battipede. Politecnico di Torino, Master of Science program in Aerospace Engineering, 2021
PDF (Tesi_di_laurea), Thesis, 5MB
Licence: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
The purpose of this Master's thesis is to investigate and improve one of the state-of-the-art Safe Reinforcement Learning algorithms. The algorithm studied applies a Safety Layer to classical Reinforcement Learning algorithms in order to achieve Safe Exploration during the learning phases, which would open the door to training intelligent agents in the real world. The Safety Layer algorithm performs well in environments where the danger is located at the edges, but its performance degrades in environments where hazards are spread heterogeneously throughout the space. To improve performance in such situations, reward shaping is introduced to reinforce the safety action of the Safety Layer.
The first chapters of the thesis present an introduction to Artificial Intelligence, Deep Neural Networks, and classic and Deep Reinforcement Learning.
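The abstract does not specify how the Safety Layer or the shaped reward is formulated, so the following is only a minimal sketch under stated assumptions, not the thesis's implementation. It assumes a single safety signal c that is linear in the action (c_next ≈ c_now + g·a, with g a sensitivity vector) and a shaping term that penalizes the reward once c approaches its limit. The names `safety_layer`, `shaped_reward`, `margin`, and `weight` are hypothetical.

```python
import numpy as np


def safety_layer(action, c_now, g, c_max):
    """Minimally correct `action` so that c_now + g.a <= c_max.

    Assumption: one linearized constraint, which gives a closed-form
    projection onto the safe half-space.
    """
    g_sq = float(np.dot(g, g))
    if g_sq == 0.0:
        # The action has no influence on the safety signal; nothing to correct.
        return action.copy()
    # Multiplier is zero when the proposed action is already safe.
    lam = max(0.0, (float(np.dot(g, action)) + c_now - c_max) / g_sq)
    return action - lam * g


def shaped_reward(reward, c_next, c_max, margin=0.5, weight=1.0):
    """Penalize the reward linearly once c_next enters a margin below c_max.

    Assumption: this linear penalty is one illustrative shaping choice.
    """
    intrusion = max(0.0, c_next - (c_max - margin))
    return reward - weight * intrusion


# Usage: an unsafe proposal is pulled back onto the constraint boundary.
proposed = np.array([1.0, 0.0])
sensitivity = np.array([1.0, 0.0])
safe = safety_layer(proposed, c_now=0.5, g=sensitivity, c_max=1.0)
```

In this sketch the Safety Layer only intervenes when the predicted safety signal would exceed the limit, while the shaped reward discourages the agent from approaching the limit at all. That split is one way to read the abstract's idea of reward shaping reinforcing the Safety Layer when hazards are spread throughout the environment.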
