Model Predictive Control and Reinforcement Learning for Quadrotor Agile Flight Control

Giacomo Dematteis

Model Predictive Control and Reinforcement Learning for Quadrotor Agile Flight Control.

Supervisor: Luciano Lavagno. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2022

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

The work presented in this master's thesis concerns the control aspects of fast and agile drone trajectory tracking. Aerodynamic forces make quadrotor trajectory tracking at high speed extremely challenging: these complex effects cause significant performance loss, measured as large position tracking errors. Model Predictive Control (MPC) together with Reinforcement Learning (RL) is used to tackle the problem. We propose to use RL to tune the MPC formulation offline using data obtained from the system. MPC is an optimal control method with a well-established theory that exploits a dynamic model of the platform and provides constraint satisfaction. RL methods allow solving control problems with minimal prior knowledge about the task. RL automatically trains the decision-making process via trial and error and maximizes performance with respect to a given reward function. In our approach, RL adjusts the MPC parameters through a Q-learning technique that uses MPC as a function approximator. Indeed, unlike Deep Neural Networks (DNNs), MPC as a function approximator for RL can explicitly achieve constraint satisfaction, stability, and safety. Therefore, the goal is to combine the advantages of both methods: the ability of MPC to safely control a physical robot through well-established knowledge and the power of RL to learn complex policies from experience data. The resulting control framework can handle large-scale inputs, reduce human intervention in design and tuning, and eventually achieve optimal control performance. The method is verified in a precise and extensive simulation environment. This work is the result of a seven-month period at the Norwegian University of Science and Technology (NTNU), Trondheim, Norway, in the group of Professor Sébastien Gros, supervised by postdoctoral fellow Dirk Reinhardt.
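
As a rough illustration of what "Q-learning with MPC as a function approximator" can mean, the Python sketch below tunes a single MPC parameter with a semi-gradient Q-learning update. It is not the thesis implementation: the scalar plant, the one-step horizon, the cost weights, the learning rate, and all function names are assumptions chosen so that the example stays short and checkable. The only tuned parameter is a hypothetical terminal-cost weight `p`.

```python
import random

# Hypothetical scalar plant x_next = A*x + B*u (illustrative numbers only).
A, B = 0.9, 0.5
Q_COST, R_COST = 1.0, 1.0   # stage cost L(x, u) = Q_COST*x^2 + R_COST*u^2
GAMMA = 0.95                # discount factor


def stage_cost(x, u):
    return Q_COST * x * x + R_COST * u * u


def mpc_q(p, x, u):
    """One-step 'MPC' action value: stage cost plus terminal cost p*x_next^2."""
    x_next = A * x + B * u
    return stage_cost(x, u) + GAMMA * p * x_next * x_next


def mpc_policy(p, x):
    """Input minimizing mpc_q (closed form, since the problem is quadratic)."""
    return -GAMMA * p * A * B * x / (R_COST + GAMMA * p * B * B)


def mpc_v(p, x):
    """Value function: the MPC cost evaluated at its own optimal input."""
    return mpc_q(p, x, mpc_policy(p, x))


def q_learning(p=0.0, alpha=0.3, episodes=5000, steps=5, seed=0):
    """Semi-gradient Q-learning on the terminal weight p."""
    rng = random.Random(seed)
    for _ in range(episodes):
        x = rng.uniform(-1.0, 1.0)
        for _ in range(steps):
            # Greedy MPC input plus small Gaussian exploration noise.
            u = mpc_policy(p, x) + rng.gauss(0.0, 0.1)
            x_next = A * x + B * u
            # Temporal-difference error (costs are minimized, hence min via mpc_v).
            td = stage_cost(x, u) + GAMMA * mpc_v(p, x_next) - mpc_q(p, x, u)
            grad = GAMMA * x_next * x_next   # d mpc_q / d p
            p += alpha * td * grad
            x = x_next
    return p


def riccati_fixed_point(iterations=1000):
    """Reference value: fixed point of the discounted scalar Riccati recursion."""
    p = 0.0
    for _ in range(iterations):
        p = Q_COST + GAMMA * p * A * A * R_COST / (R_COST + GAMMA * p * B * B)
    return p


if __name__ == "__main__":
    print("learned p:  ", q_learning())
    print("reference p:", riccati_fixed_point())
```

In this toy setting the MPC model matches the plant exactly, so the learned weight should approach the discounted Riccati solution, which `riccati_fixed_point` computes for comparison. In a setting like the one the abstract describes, the learned parameters would instead compensate for effects the nominal model misses, and the constraints would remain part of the MPC problem.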

Supervisors: Luciano Lavagno
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 52
Degree programme: Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica)
Degree class: New organization > Master science > LM-25 - AUTOMATION ENGINEERING
Joint supervision institution: Norwegian University of Science and Technology (Norway)
Collaborating companies: Norwegian University of Science and Tech
URI: http://webthesis.biblio.polito.it/id/eprint/24663