Simone Alberto Peirone
EGO-T^3: Test-Time Training for Egocentric Videos
Supervisors: Barbara Caputo, Mirco Planamente, Chiara Plizzari. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2022
Document type: Tesi (thesis), available as PDF (Tesi_di_laurea)
License: Creative Commons Attribution Non-commercial No Derivatives
Abstract
In the last few years, the technological advancement of wearable cameras has led to an increasing interest in egocentric (first-person) vision. The ability to capture activities from the user's perspective offers significant opportunities for a more in-depth study of human behavior than the third-person setting, as the sensors are much closer to the actions and embed a natural form of attention stemming from the wearer's gaze direction. The research community has benefited greatly from egocentric vision across a variety of tasks, such as human-object interaction, action prediction and anticipation, wearer pose estimation, and video anonymization. A crucial aspect of several video-related tasks is their multimodal nature: audio, RGB, and optical flow provide complementary insights that are critical to a thorough understanding of the real world.
