Mattia Sabato
Higher-Order Message Passing for Structured Video Understanding of Human Dynamics.
Supervisors: Giuseppe Bruno Averta, Francesca Pistilli, Simone Alberto Peirone, Giulia Fracastoro. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2026
PDF (Tesi_di_laurea)
- Thesis
Access restricted to: staff users only until 27 March 2029 (embargo date). License: Creative Commons Attribution Non-commercial No Derivatives. Download (3MB)
Abstract
Structured Video Understanding aims to generate structured representations of events occurring in a video. A common approach relies on Graph Neural Networks (GNNs) trained on scene graphs describing the entities and relationships present in individual frames. By exploiting relational information, these models support downstream tasks such as action recognition, video question-answering and scene graph generation. However, a key architectural limitation of standard GNNs lies in their reliance on pairwise interactions, which may be insufficient to capture the complex dynamics underlying human activities. To address this limitation, we investigate architectures based on Higher-Order relationships, enabling models to learn more abstract representations derived from interactions among multiple entities considered jointly.
We introduce a new dataset comprising around seven million frames extracted from the Kitchens subset of Ego4D, each annotated with a corresponding scene graph
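The contrast the abstract draws between pairwise and higher-order interactions can be illustrated with a toy example. The sketch below is not the architecture proposed in the thesis: the entity names, the feature vectors, the mean aggregation and the function names `pairwise_step` and `hyperedge_step` are all assumptions chosen for illustration. It shows one round of message passing over ordinary edges, where each node only sees its direct neighbours, next to one round over a hyperedge, where all participating entities are summarised jointly before the nodes are updated.

```python
def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def pairwise_step(features, edges):
    """One round of mean aggregation over undirected pairwise edges."""
    neighbours = {node: [] for node in features}
    for u, v in edges:
        neighbours[u].append(features[v])
        neighbours[v].append(features[u])
    # Isolated nodes keep their own features.
    return {n: mean(vs) if vs else features[n] for n, vs in neighbours.items()}


def hyperedge_step(features, hyperedges):
    """One round of two-stage aggregation: nodes -> hyperedge -> nodes."""
    # Stage 1: each hyperedge summarises all of its members jointly.
    messages = [mean([features[n] for n in members]) for members in hyperedges]
    # Stage 2: each node averages the messages of the hyperedges it belongs to.
    incoming = {node: [] for node in features}
    for members, message in zip(hyperedges, messages):
        for n in members:
            incoming[n].append(message)
    return {n: mean(ms) if ms else features[n] for n, ms in incoming.items()}


# Hypothetical scene-graph entities with made-up 2-d features.
features = {"hand": [1.0, 0.0], "knife": [0.0, 1.0], "tomato": [1.0, 1.0]}
edges = [("hand", "knife"), ("knife", "tomato")]   # pairwise relations
hyperedges = [("hand", "knife", "tomato")]         # one joint interaction

pairwise_out = pairwise_step(features, edges)
hyper_out = hyperedge_step(features, hyperedges)
```

In the pairwise round, `hand` receives information from `knife` only, so it would need a second round to learn anything about `tomato`. In the hyperedge round, a single step already exposes every entity to a summary of the whole three-way interaction.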
