Mattia Sabato
Higher-Order Message Passing for Structured Video Understanding of Human Dynamics.
Supervisors: Giuseppe Bruno Averta, Francesca Pistilli, Simone Alberto Peirone, Giulia Fracastoro. Politecnico di Torino, Master of Science program in Data Science and Engineering, 2026
PDF (Tesi_di_laurea) - Thesis
Restricted to staff users only until 27 March 2029 (embargo date). Licence: Creative Commons Attribution Non-commercial No Derivatives. File size: 3MB.
Abstract
Structured Video Understanding aims to generate structured representations of events occurring in a video. A common approach relies on Graph Neural Networks (GNNs) trained on scene graphs describing the entities and relationships present in individual frames. By exploiting relational information, these models support downstream tasks such as action recognition, video question-answering and scene graph generation. However, a key architectural limitation of standard GNNs lies in their reliance on pairwise interactions, which may be insufficient to capture the complex dynamics underlying human activities. To address this limitation, we investigate architectures based on Higher-Order relationships, enabling models to learn more abstract representations derived from interactions among multiple entities considered jointly.
We introduce a new dataset comprising around seven million frames extracted from the Kitchens subset of Ego4D, each annotated with a corresponding scene graph
