Maria Rosa Scoleri
Towards Egocentric Scene Graph Understanding with Graph Neural Networks.
Supervisors: Tatiana Tommasi, Antonio Alliegro. Politecnico di Torino, Master's degree in Ingegneria Informatica (Computer Engineering), 2024
PDF (Tesi_di_laurea) - Thesis (27MB)
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Egocentric vision is a domain of computer vision centered on video data captured from wearable devices such as head-mounted cameras. Videos from the user's viewpoint offer unique insights into human behavior and environmental contexts, with applications in augmented reality, activity recognition, and human-computer interaction. This thesis aims to develop a model to extract relevant features from egocentric videos exploiting labels constructed using scene graphs, which summarize the content of a given frame with verb-object-relationship triplets. Moreover, we propose a novel approach to the action anticipation task using graph-structured encoded data. We employ a Graph Neural Network (GNN) where visual features extracted from video frames serve as GNN nodes, while edges model the relationships between them.
GNN training uses the verb-object-relationship triplets as labels, allowing the model to learn frame features relevant to egocentric tasks.
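The graph formulation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the GNN architecture, so everything below is an assumption made for illustration: the function name `message_passing_step`, the node names, the two-dimensional toy features, and the choice of an unweighted mean-aggregation update with no learned parameters. It shows only the general idea of nodes carrying visual features and edges carrying relationships between them, not the thesis model.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def message_passing_step(
    features: Dict[str, Vector],
    edges: List[Tuple[str, str]],
) -> Dict[str, Vector]:
    """One round of mean-aggregation message passing on an undirected graph.

    Hypothetical simplification: each node's new feature is the average of
    its own feature and the mean of its neighbours' features. A real GNN
    layer would also apply learned weights and a non-linearity.
    """
    # Build an adjacency list; every edge is treated as bidirectional.
    neighbours: Dict[str, List[str]] = {node: [] for node in features}
    for src, dst in edges:
        neighbours[src].append(dst)
        neighbours[dst].append(src)

    updated: Dict[str, Vector] = {}
    for node, own in features.items():
        adjacent = neighbours[node]
        if not adjacent:
            # Isolated nodes keep their feature unchanged.
            updated[node] = list(own)
            continue
        dim = len(own)
        # Mean of the neighbours' features, computed per dimension.
        mean = [
            sum(features[n][k] for n in adjacent) / len(adjacent)
            for k in range(dim)
        ]
        # Blend the node's own feature with the aggregated message.
        updated[node] = [0.5 * (own[k] + mean[k]) for k in range(dim)]
    return updated


# Toy frame: three nodes with made-up 2-D "visual features".
frame_graph = {"hand": [1.0, 0.0], "knife": [0.0, 1.0], "onion": [0.0, 0.0]}
relations = [("hand", "knife"), ("knife", "onion")]
result = message_passing_step(frame_graph, relations)
```

After one step each node's feature mixes in information from the nodes it is related to, which is the property that lets a graph-based model reason about a frame as a set of interacting entities instead of isolated regions.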
