
Training Kernel Neural ODEs with optimal control and Riemannian optimization

Matteo Raviola


Supervisors: Claudio Canuto, Fabio Nobile. Politecnico di Torino, Master's degree programme in Mathematical Engineering, 2022

PDF (Tesi_di_laurea) - Thesis, 13MB
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract:

Nowadays, Machine Learning pipelines permeate the scientific computing world. The flexibility of Neural Networks makes them a formidable tool for numerous kinds of tasks; however, their training keeps proving to be a computationally challenging optimization problem. This thesis focuses on a specific kind of Neural ODE, the Kernel Neural ODE (KerODE), in which the usual parametric non-linearities are replaced by elements of a reproducing kernel Hilbert space (RKHS) fixed a priori. Classical training algorithms are based on a variant of stochastic gradient descent, coupled with the celebrated backpropagation algorithm for gradient computations. Though extremely versatile, these approaches can suffer from long computational times and/or a high cost per iteration. We propose and numerically explore methodologies to overcome both of these issues for the optimization of KerODE parameters in the context of a regression task. In particular, we first exploit the dynamical-systems perspective of Deep Learning to link the training problem to an optimal control problem and, in turn, reduce it to the solution of a two-point boundary value problem. Inspired by time-parallel ODE integration techniques, we develop a multi-grid algorithm to speed up optimization and numerically investigate its performance. As an alternative approach, we formally introduce a differential structure on the family of mappings realized by KerODEs, enabling the use of continuous Riemannian optimization techniques to solve the training problem. This furnishes a new and compelling perspective on Neural ODEs as the realization of a Riemannian gradient/Newton flow, which in practice leads to layer-by-layer optimization techniques, thus alleviating the cost per iteration of classical approaches.
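The optimal-control view described in the abstract can be illustrated with a minimal sketch: a Kernel Neural ODE whose vector field is a kernel expansion with fixed centers, discretized by explicit Euler, trained on a toy 1-D regression task via the adjoint (backward) sweep that the two-point boundary value problem structure induces. All names, the RBF kernel choice, the discretization, and plain gradient descent are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

# Toy 1-D regression: learn y = sin(x) with a scalar Kernel Neural ODE.
# The state z(t) evolves as dz/dt = sum_j theta_j(t) * k(c_j, z), where the
# kernel k (here an RBF) is fixed a priori: the RKHS nonlinearity.
rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 40)           # inputs = initial states z(0)
y = np.sin(x)                        # regression targets

centers = np.linspace(-3, 3, 8)      # fixed kernel centers c_j

def k(z):                            # RBF features k(c_j, z_i), shape (40, 8)
    return np.exp(-0.5 * (z[:, None] - centers[None, :]) ** 2)

T, h = 10, 0.1                       # Euler discretization of t in [0, 1]
theta = np.zeros((T, len(centers)))  # time-dependent (layer-wise) controls

def forward(theta):
    zs = [x.copy()]
    for t in range(T):               # explicit Euler: z_{t+1} = z_t + h f(z_t)
        zs.append(zs[-1] + h * k(zs[-1]) @ theta[t])
    return zs

def loss_and_grad(theta):
    zs = forward(theta)
    r = zs[-1] - y                   # residual of terminal state vs target
    lam = r / len(x)                 # terminal adjoint: lambda(1) = dL/dz(1)
    grad = np.zeros_like(theta)
    for t in reversed(range(T)):     # backward sweep (adjoint equation)
        K = k(zs[t])
        grad[t] = h * K.T @ lam      # dL/dtheta_t from the current adjoint
        # propagate adjoint: lam_t = lam_{t+1} * (1 + h * df/dz)
        dKdz = -(zs[t][:, None] - centers[None, :]) * K
        lam = lam * (1.0 + h * dKdz @ theta[t])
    return 0.5 * np.mean(r ** 2), grad

for it in range(500):                # plain gradient descent on the controls
    L, g = loss_and_grad(theta)
    theta -= 2.0 * g
print(f"final loss: {L:.4f}")
```

Each forward/backward pass here solves the state and adjoint equations sequentially; the multi-grid and layer-by-layer Riemannian strategies the abstract mentions aim precisely at reducing the cost of iterating this coupled system.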

Supervisors: Claudio Canuto, Fabio Nobile
Academic year: 2022/23
Publication type: Electronic
Number of pages: 122
Subjects:
Degree programme: Master's degree programme in Mathematical Engineering
Degree class: New system > Master's degree > LM-44 - Mathematical-Physical Modelling for Engineering
Joint supervision institution: École Polytechnique Fédérale de Lausanne (SWITZERLAND)
Partner companies: ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE
URI: http://webthesis.biblio.polito.it/id/eprint/24047