Jacopo Spaccatrosi
CLIP-MD: Causal and Low Latency Inference for Procedural Mistake Detection.
Supervisors: Francesca Pistilli, Giuseppe Bruno Averta, Gaetano Salvatore Falco. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2026
PDF (Tesi_di_laurea)
- Thesis
License: Creative Commons Attribution Non-commercial No Derivatives. File size: 11MB.
Abstract
Procedural Mistake Detection (PMD) aims to identify deviations from an expected multi-step workflow while a user is performing a task. In egocentric settings, this requires strictly causal inference, low latency, and robustness to visual noise such as occlusions and fine-grained hand–object interactions. In this thesis, PMD is formulated as a One-Class Classification problem: models are trained only on correct executions and must detect any deviation at inference time. We adopt a dual-branch online pipeline where (i) a step recognition branch performs Online Action Detection (OAD) and (ii) a step anticipation branch predicts the next expected step from the recognised history; a mistake is signalled when executed and expected steps disagree.
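The disagreement rule described above can be illustrated with a minimal sketch. It is not the thesis implementation: the recognition branch is replaced by a ready-made stream of step labels, and the anticipation branch by a simple table of transitions observed in correct executions, which looks only at the previous step rather than the full recognised history. All names (`fit_transitions`, `detect_mistakes`, the `<start>` token, the step labels) are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Set

START = "<start>"  # hypothetical marker for "no step executed yet"


def fit_transitions(correct_runs: Iterable[Sequence[str]]) -> Dict[str, Set[str]]:
    """One-class training: record only transitions seen in correct executions."""
    allowed: Dict[str, Set[str]] = defaultdict(set)
    for run in correct_runs:
        prev = START
        for step in run:
            allowed[prev].add(step)
            prev = step
    return dict(allowed)


def detect_mistakes(recognised: Sequence[str],
                    allowed: Dict[str, Set[str]]) -> List[bool]:
    """Causal check: each decision uses only steps recognised so far.

    A step is flagged when it is not among the steps expected
    after the previously recognised step.
    """
    flags: List[bool] = []
    prev = START
    for step in recognised:
        expected = allowed.get(prev, set())  # stand-in for the anticipation branch
        flags.append(step not in expected)   # mistake = executed and expected disagree
        prev = step
    return flags


# Hypothetical correct executions of a short procedure.
correct_runs = [
    ["crack_egg", "whisk", "pour", "cook"],
    ["crack_egg", "whisk", "add_salt", "pour", "cook"],
]
allowed = fit_transitions(correct_runs)

# The user skips "whisk": the second recognised step is flagged.
flags = detect_mistakes(["crack_egg", "pour", "cook"], allowed)
```

Because the table is built from correct runs only, any transition never seen during training is treated as a deviation, which mirrors the One-Class Classification framing in a very simplified form.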
For online step recognition, we analyse lightweight recurrent baselines and agnostic multimodal LLM-based approaches, showing that the latter remain unsuitable for real-time OAD due to limited fine-grained accuracy and low throughput.
