Graph Neural Network for Event-based Vision

Daniele Busacca

Graph Neural Network for Event-based Vision.

Supervisors: Luciano Lavagno, Fabrizio Ottati, Muhammad Usman, Filippo Minnella. Politecnico di Torino, Master's degree programme in Electronic Engineering, 2023

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

In recent years, event cameras, also known as silicon retinas, have emerged as a novel paradigm for capturing visual information sparsely and asynchronously, offering significant advantages in applications such as robotics and computer vision. Exploiting their full potential, however, requires new algorithms. The most effective learning algorithms developed for event cameras either use Spiking Neural Networks (SNNs) to process the stream event by event, or first transform events into dense representations that are then processed by conventional Convolutional Neural Networks (CNNs). Nonetheless, SNNs do not provide a back-propagation learning mechanism, and the CNN approach discards both the inherent sparsity and the fine-grained temporal resolution of events, imposing a substantial computational load and added latency. For this reason, this thesis proposes a Machine Learning (ML) algorithm based on Graph Neural Networks (GNNs) to process data streams from event cameras. GNNs operate on data of irregular shape and size, so they can process events as spatio-temporal graphs, which are inherently sparse. The primary focus is on event classification, that is, determining the class to which input data belongs using a model trained on a dataset of event streams. The study uses the IBM DVS Gesture dataset, which is event-based and is therefore converted into a graph-based dataset to make it compatible with GNNs. This conversion is a preprocessing phase composed of several sub-steps: event sub-stream selection, sub-sampling, time normalization and graph creation. Part of the whole event stream produced by a hand gesture is thus sampled, discretized in the time domain and then used to create an event-graph with the radius-neighborhood algorithm. Each of these sub-steps is governed by one or more parameters, which can heavily affect system performance.
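The preprocessing chain described above can be illustrated with a minimal sketch. It is not the thesis code: it uses only the Python standard library instead of NumPy, and the event layout `(x, y, t, polarity)`, the function names and the parameters `t_start`, `window`, `stride`, `t_scale` and `radius` are all assumptions made for illustration. In particular, the exact form of the time normalization used in the thesis is not given in the abstract; here timestamps are simply rescaled to a fixed range.

```python
import math

def preprocess(events, t_start, window, stride, t_scale):
    """Select a sub-stream, sub-sample it, and normalize its timestamps.

    events: list of (x, y, t, polarity) tuples sorted by timestamp t.
    All names and parameters are hypothetical, chosen for this sketch.
    """
    # 1) Sub-stream selection: keep events inside [t_start, t_start + window).
    sub = [e for e in events if t_start <= e[2] < t_start + window]
    # 2) Sub-sampling: keep every `stride`-th event of the sub-stream.
    sub = sub[::stride]
    # 3) Time normalization: rescale timestamps to [0, t_scale) so that the
    #    time axis is comparable in magnitude to the pixel coordinates.
    return [(x, y, (t - t_start) / window * t_scale, p) for x, y, t, p in sub]

def radius_graph(nodes, radius):
    """Radius-neighborhood graph: link nodes at most `radius` apart in (x, y, t)."""
    edges = []
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if math.dist(nodes[i][:3], nodes[j][:3]) <= radius:
                edges.append((i, j))  # undirected edge, stored once as (i, j)
    return edges

# Toy event stream: two spatial clusters plus one late event outside the window.
events = [(10, 10, 0, 1), (11, 10, 100, 1), (50, 50, 200, 0),
          (12, 11, 300, 1), (51, 50, 400, 0), (90, 90, 2000, 1)]
nodes = preprocess(events, t_start=0, window=1000, stride=1, t_scale=10.0)
edges = radius_graph(nodes, radius=5.0)
```

The pairwise loop is quadratic in the number of events, which is acceptable for a toy example only; a real pipeline would typically rely on a spatial index or a library routine for neighbor search. The sketch does show why the parameters matter: a larger `stride` or a smaller `radius` directly reduces the number of nodes and edges the GNN has to process.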
The primary objective of this research is therefore to classify event-graphs generated from event sub-streams. Model performance is measured by accuracy, the ratio of correctly predicted labels to the total number of processed samples. The model consists of graph convolutional, batch normalization and pooling layers followed by a Multi-Layer Perceptron for classification; its parameters are fitted by a preliminary training process that is itself governed by several hyper-parameters. The initial preprocessing phase relies on Python libraries such as NumPy, while the subsequent stages, including GNN model architecture design, training and evaluation, are carried out with the PyTorch and PyTorch Geometric libraries.
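The accuracy metric mentioned above amounts to a simple ratio. The following sketch shows it in plain Python; the function name `accuracy` and the example labels are hypothetical and do not come from the thesis.

```python
def accuracy(predicted, expected):
    """Fraction of samples whose predicted label equals the expected label."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("at least one sample is required")
    # Count the positions where the predicted and expected labels agree.
    correct = sum(p == y for p, y in zip(predicted, expected))
    return correct / len(expected)

# Three of the four event-graphs are classified correctly.
acc = accuracy([0, 3, 3, 7], [0, 3, 5, 7])
```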

Supervisors: Luciano Lavagno, Fabrizio Ottati, Muhammad Usman, Filippo Minnella
Academic year: 2023/24
Publication type: Electronic
Number of pages: 81
Subjects:
Degree programme: Master's degree programme in Electronic Engineering
Degree class: New system > Master's degree > LM-29 - ELECTRONIC ENGINEERING
Collaborating companies: NOT SPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/29450