Implementation of a novel control-based training algorithm for recurrent neural networks

Giovanni Catalano, Corrado Raiola

Supervisors: Sophie Fosson, Vito Cerone, Simone Pirrera, Diego Regruto Tomalino. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2023

Abstract:

The primary objective of this master's thesis project is to implement in software, and to test the performance of, a novel control-based optimization algorithm, referred to as the feedback linearization controlled multiplier method, applied to the training of a specific class of recurrent neural networks (RNNs) known as Output Error Neural Networks (NNOE). The final goal is to construct mathematical models from sequential data that capture the temporal dependencies between input and output measurements. We implement the algorithm in both Python and MATLAB to assess various performance aspects, address critical issues, and evaluate the efficiency of GPU utilization. We experiment with several Python libraries, such as PyTorch, TensorFlow, and NumPy, and conclude that PyTorch provides the best performance. Initially, we develop single-layer networks; we then extend the implementation to multi-layer networks. Next, we develop a variation of the algorithm that handles errors in the input samples, thus defining a training algorithm for the more general neural network errors-in-variables model. Moreover, we implement and analyze the behaviour of a training algorithm based on a different optimization method, known as PI-controlled multipliers optimization. We begin by testing the code on straightforward optimization problems and later move to more complex systems. A detailed profiling analysis allows us to identify the bottleneck in both the MATLAB and the Python code, which in both cases is the computation of the Jacobian of the constraints. We propose a comparison in which we compute the Jacobian directly from analytical partial derivatives; notably, this shows that the MATLAB code achieves shorter training times than the Python script. However, a flexible Python code employing the more general jacrev function from the PyTorch library enables us to exploit different activation functions, train more complex networks, and leverage the power of the GPU.
Moreover, we compare NNOE networks trained with the feedback linearization controlled multiplier method against standard LSTM-based identification. We observe that the latter requires a longer training time while leading to similar results in terms of training and validation error.
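
The abstract identifies the Jacobian of the constraints as the computational bottleneck and mentions computing it from analytical partial derivatives. Since the full text is not available, the following is only a minimal sketch under stated assumptions, not the thesis implementation: the single-layer tanh network, the choice of one past output and one past input as regressors, the definition of the constraints as the residuals between simulated and measured output, and all names (`simulate`, `residuals`, `jacobian_analytic`, `jacobian_fd`, `H`, `theta`) are hypothetical. A central finite-difference Jacobian stands in as an independent check; it is not the jacrev approach.

```python
import math

H = 2  # number of hidden units (hypothetical size)

def unpack(theta):
    # Assumed parameter layout: feedback weights a, input weights c,
    # hidden biases b, output weights v, output bias d.
    a, c, b, v = (theta[i * H:(i + 1) * H] for i in range(4))
    return a, c, b, v, theta[4 * H]

def simulate(theta, u, y0):
    """Output-error simulation: each prediction is fed back into the next step."""
    a, c, b, v, d = unpack(theta)
    yhat = [y0]
    for k in range(1, len(u)):
        z = [math.tanh(a[i] * yhat[k - 1] + c[i] * u[k - 1] + b[i]) for i in range(H)]
        yhat.append(sum(v[i] * z[i] for i in range(H)) + d)
    return yhat

def residuals(theta, u, y):
    """Assumed constraints: simulated output minus measured output, k = 1..N-1."""
    yhat = simulate(theta, u, y[0])
    return [yhat[k] - y[k] for k in range(1, len(u))]

def jacobian_analytic(theta, u, y0):
    """Jacobian of the residuals via the sensitivity recursion
    d yhat[k]/d theta = direct partials + (d yhat[k]/d yhat[k-1]) * d yhat[k-1]/d theta."""
    a, c, b, v, d = unpack(theta)
    n = len(theta)
    y_prev, s_prev = y0, [0.0] * n  # the initial condition does not depend on theta
    rows = []
    for k in range(1, len(u)):
        z = [math.tanh(a[i] * y_prev + c[i] * u[k - 1] + b[i]) for i in range(H)]
        g = [v[i] * (1.0 - z[i] ** 2) for i in range(H)]  # v_i * tanh'(.)
        direct = ([g[i] * y_prev for i in range(H)]       # w.r.t. a
                  + [g[i] * u[k - 1] for i in range(H)]   # w.r.t. c
                  + g + z + [1.0])                        # w.r.t. b, v, d
        feedback = sum(g[i] * a[i] for i in range(H))     # d yhat[k] / d yhat[k-1]
        s_prev = [direct[j] + feedback * s_prev[j] for j in range(n)]
        y_prev = sum(v[i] * z[i] for i in range(H)) + d
        rows.append(s_prev)
    return rows

def jacobian_fd(theta, u, y, eps=1e-6):
    """Central finite differences, used here only to check the analytic result."""
    cols = []
    for j in range(len(theta)):
        tp, tm = list(theta), list(theta)
        tp[j] += eps
        tm[j] -= eps
        rp, rm = residuals(tp, u, y), residuals(tm, u, y)
        cols.append([(p - m) / (2 * eps) for p, m in zip(rp, rm)])
    return [[col[k] for col in cols] for k in range(len(cols[0]))]

# Synthetic data, made up for illustration only.
u = [math.sin(0.3 * k) for k in range(20)]
y = [0.5 * math.sin(0.3 * k - 0.2) for k in range(20)]
theta = [0.3, -0.2, 0.5, 0.4, 0.1, -0.1, 0.7, -0.6, 0.05]

J_a = jacobian_analytic(theta, u, y[0])
J_fd = jacobian_fd(theta, u, y)
max_diff = max(abs(p - q) for ra, rf in zip(J_a, J_fd) for p, q in zip(ra, rf))
print(f"Jacobian: {len(J_a)} x {len(J_a[0])}, max |analytic - FD| = {max_diff:.2e}")
```

The analytic recursion needs a single pass over the data, whereas the finite-difference check re-simulates the network twice per parameter. In PyTorch, `torch.func.jacrev` obtains the same kind of Jacobian by reverse-mode automatic differentiation without hand-derived partials, which is consistent with the flexibility noted in the abstract.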

Supervisors: Sophie Fosson, Vito Cerone, Simone Pirrera, Diego Regruto Tomalino
Academic year: 2023/24
Publication type: Electronic
Number of pages: 130
Additional information: Confidential thesis. Full text not available
Subjects:
Degree programme: Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica)
Degree class: New system > Master's degree > LM-25 - AUTOMATION ENGINEERING
Collaborating companies: NOT SPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/29332