Luca Montecchio
Development and Real-Time Implementation of Reinforcement Learning Based Controller for quadrotor UAVs Applications.
Rel. Alessandro Rizzo, Kimon Valavanis. Politecnico di Torino, Corso di laurea magistrale in Mechatronic Engineering (Ingegneria Meccatronica), 2024
PDF (Tesi_di_laurea) - Thesis
Restricted to: Repository staff only until 31 October 2025 (embargo date). License: Creative Commons Attribution Non-commercial No Derivatives. File size: 21 MB
Abstract:
Unmanned Aerial Vehicles (UAVs) are today in widespread use around the globe, for both civilian and military purposes, performing tasks that would otherwise require a pilot on board a helicopter or airplane, with the associated financial cost and risk. The purpose of this thesis is to develop, implement, and test on physical hardware a Reinforcement Learning (RL) based PID controller. The RL component uses the Deep Deterministic Policy Gradient (DDPG) algorithm, an off-policy actor-critic method. The PID design is carried out through fine tuning and parameter estimation of the controller's inner-loop gains for three circular trajectories with different durations and velocities. The controller has been developed in Matlab/Simulink, with PX4 as the flight control software used alongside it. The quadrotor model was first developed in a Matlab/Simulink environment, based on the PX4 software formulation and drone configuration, for training and for Software-In-The-Loop (SITL) tests. After the training phase, the controller was tested in a more realistic setting through Hardware-In-The-Loop (HITL) simulations using PX4 and its built-in simulated environment jMAVSim, exploiting their compatibility with Simulink. Finally, experimental tests were performed in an outdoor environment subject to various uncontrollable disturbances, to better evaluate the controller's ability to adapt the gains in real time. A key achievement is the complete implementation of the code, including the RL agent, in Simulink for the HITL simulations and on the Pixhawk 2.1 hardware board for the real flights, without the support of additional hardware. Furthermore, the number of neurons in the agent's neural network was reduced without degrading performance. Performance evaluation and comparison studies between the manually tuned, fine-tuned, and estimated-gain approaches are detailed for the SITL, HITL, and experimental tests. The drone reacted well to both simulated and real environmental noise, adapting the gains to counteract disturbances such as wind or ground effect. From this preliminary work, with its novel Matlab/Simulink and PX4 implementation for deploying an RL-based PID controller on a quadrotor UAV, future tests can be supported and developed for other control techniques.
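As context for the abstract, the sketch below shows one way a DDPG agent that outputs PID gains could be set up with MATLAB's Reinforcement Learning Toolbox. The observation/action dimensions, gain bounds, network size, and sample time are illustrative assumptions, not the configuration actually used in the thesis.

```matlab
% Minimal sketch (assumptions, not the thesis configuration): a DDPG agent
% whose continuous action is a vector of three inner-loop PID gains [Kp; Ki; Kd].
% Requires MATLAB's Reinforcement Learning Toolbox.

obsInfo = rlNumericSpec([6 1]);                  % hypothetical tracking-error/rate observation
actInfo = rlNumericSpec([3 1], ...
    'LowerLimit', zeros(3,1), ...
    'UpperLimit', [10; 5; 2]);                   % assumed bounds on the adapted gains

initOpts  = rlAgentInitializationOptions('NumHiddenUnit', 32);   % small actor/critic networks
agentOpts = rlDDPGAgentOptions('SampleTime', 0.01, ...
    'DiscountFactor', 0.99, ...
    'MiniBatchSize', 64);

% Off-policy actor-critic (DDPG) agent built from the specifications
agent = rlDDPGAgent(obsInfo, actInfo, initOpts, agentOpts);

% In a Simulink control loop, the trained policy would be queried each step:
%   gains = cell2mat(getAction(agent, {obs}));   % [Kp; Ki; Kd] fed to the inner PID loop
```

During training, the Simulink quadrotor model would typically be wrapped as an RL environment (e.g. with rlSimulinkEnv) and the agent trained with train, before the resulting policy is deployed to the Pixhawk target.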
Relators: Alessandro Rizzo, Kimon Valavanis
Academic year: 2024/25
Publication type: Electronic
Number of pages: 153
Subjects:
Degree course: Corso di laurea magistrale in Mechatronic Engineering (Ingegneria Meccatronica)
Degree class: New organization > Master of science > LM-25 - AUTOMATION ENGINEERING
Co-supervising institution: University of Denver (UNITED STATES OF AMERICA)
Collaborating companies: University of Denver
URI: http://webthesis.biblio.polito.it/id/eprint/33242