Politecnico di Torino

On the viability and effectiveness of Reinforcement Learning techniques for Wave-Energy converter control.

Leonardo Gambarelli


Supervisors: Giuliana Mattiazzo, Edoardo Pasta, Sergej Antonello Sirigu. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2022

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract:

An application of Reinforcement Learning (RL) techniques, a class of Machine Learning algorithms inspired by human learning (through positive and negative rewards), is investigated for the problem of wave-energy converter (WEC) control. RL techniques seem particularly suited to the WEC control problem because of their model-free nature and their intrinsic formulation as a Markov Decision Process, which is a natural way to cast the situations a WEC encounters. Wave energy is an attractive source given its high energy potential, which is almost entirely untapped: it is estimated that harvesting 0.2% of the energy of the Earth's seas would cover the worldwide energy demand. The main obstacle to making wave energy viable is its control problem: optimal control is needed for economic viability, but this optimal control is usually formulated by exploiting control-oriented models of the WECs, and those models are affected by strong uncertainties that can result in suboptimal control. For these reasons, this thesis explores the possibility of applying data-driven control techniques such as RL to the WEC control problem. First, WECs are defined and the most common classifications of these devices are presented. Then the theory of dynamical systems is introduced, along with the traditional control techniques used for WECs and some more recent ones. After that, the theory of Reinforcement Learning is presented, with its most common variants: Q-learning, Sarsa, Least-Squares Policy Iteration, and others.
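The difference between the two tabular variants named above can be made concrete. This is a minimal sketch, not the thesis code: Q-learning is off-policy and bootstraps on the greedy next action, while Sarsa is on-policy and bootstraps on the action actually taken next; the learning rate and discount values are illustrative.

```python
import numpy as np

alpha, gamma = 0.1, 0.9  # learning rate and discount factor (illustrative values)

def q_learning_update(Q, s, a, r, s_next):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))"""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

def sarsa_update(Q, s, a, r, s_next, a_next):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a))"""
    Q[s, a] += alpha * (r + gamma * Q[s_next, a_next] - Q[s, a])

# Tiny usage example on a 4-state, 2-action table.
Q = np.zeros((4, 2))
q_learning_update(Q, s=0, a=1, r=1.0, s_next=2)
sarsa_update(Q, s=0, a=0, r=1.0, s_next=2, a_next=1)
```

The single line that differs (`Q[s_next].max()` versus `Q[s_next, a_next]`) is what makes Q-learning converge faster in a fully deterministic interaction, as the abstract notes.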
With the theoretical basis settled, a first MATLAB algorithm is developed: a Q-learning algorithm with some simplifications. It is an offline algorithm that computes all possible scenarios before the learning phase and uses those data for the actual learning, simulating the system as a simplified dynamical model, a mass-spring-damper (MSD) system, with ideal monochromatic exciting waves. The same is then done for a Sarsa algorithm, showing that both algorithms are theoretically feasible and that the WEC control problem is easily cast as an RL problem through the use of states and actions. Moreover, it is shown that, due to the fully deterministic interaction, Q-learning converges faster than Sarsa. From these first algorithms, the ideal assumptions are gradually removed: a Simulink scheme for simulating the system in real time is developed, together with an online version of the Q-learning algorithm, which observes the power extracted in real time in order to choose its next actions. The final step is to use an actual WEC model: specifically, a Pendulum WEC (PeWEC) model, defined in the MATLAB/Simulink environment through its state-space representation matrices. For all these algorithms, a parameter sensitivity study is carried out: the codes are run multiple times, each time with different settings of the learning parameters, so that the effect of varying each parameter on the learning experience can be observed. The results are analyzed, showing that RL algorithms can handle the WEC control problem and have some interesting strengths, such as automatically recognizing variations in the system and adapting accordingly. Finally, it is shown that irregular waves can also be translated into equivalent ideal monochromatic waves, so that RL methods are able to recognize them as actual states.

Supervisors: Giuliana Mattiazzo, Edoardo Pasta, Sergej Antonello Sirigu
Academic year: 2022/23
Publication type: Electronic
Number of pages: 126
Subjects:
Degree programme: Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica)
Degree class: New regulations > Master's degree > LM-25 - INGEGNERIA DELL'AUTOMAZIONE
Partner companies: NOT SPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/24664