EVEgo: Egocentric Event-data for cross-domain analysis in first-person action recognition

Gabriele Goletto

Advisors: Barbara Caputo, Mirco Planamente, Chiara Plizzari, Matteo Matteucci. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2021

PDF (Tesi_di_laurea), 25 MB
License: Creative Commons Attribution Non-commercial No Derivatives.

Event cameras are novel bio-inspired sensors that asynchronously capture pixel-level intensity changes in the form of "events". The innovative way they acquire data offers several advantages over standard devices, especially under poor lighting and high-speed motion. In particular, their high pixel bandwidth reduces motion blur, and their high dynamic range makes them a suitable alternative to traditional cameras in challenging robotics scenarios. Moreover, their low latency and low power consumption enable their use in real-world applications. These peculiarities make them well suited to tackle well-known issues that arise from the use of wearable devices, such as fast camera motion and background clutter. However, their potential in such applications, including egocentric action recognition, is still underexplored.

In this work, we bring to light the potential of event sensors in first-person action recognition, showing the advantages they offer over traditional cameras. Specifically, the latter suffer from egomotion, a phenomenon arising from the rapid and involuntary motion of the wearable device, which inevitably moves with the user; the high temporal resolution of event cameras, by contrast, enables them to extract continuous information from the video despite this motion. The recent release of the large-scale EPIC-Kitchens dataset, comprising multiple input modalities, i.e., audio, RGB, and optical flow, makes it possible to demonstrate the advantages of the event modality over the traditional ones from the first-person viewpoint. In this thesis, we propose an Event version of the large-scale EPIC-Kitchens dataset, unlocking the possibility to explore the behavior of event data in first-person action recognition scenarios. Extensive experiments have been carried out by repurposing a variety of popular action recognition architectures in conjunction with recent Domain Adaptation methods. These experiments show the potential of event data in both intra- and cross-domain scenarios, establishing a large egocentric action recognition benchmark.
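The abstract describes events as asynchronous pixel-level intensity changes. Before such a stream can be fed to the frame-based action recognition architectures mentioned above, it is commonly accumulated into a fixed-size tensor. The following is a minimal sketch of one such accumulation scheme (a two-channel polarity count frame); the function name and the (timestamp, x, y, polarity) event layout are illustrative assumptions, not the thesis's actual representation:

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate an event stream into a 2-channel count frame.

    Each event is a (t, x, y, p) tuple: timestamp, pixel coordinates,
    and polarity (+1 for a brightness increase, -1 for a decrease).
    Channel 0 counts positive events, channel 1 negative ones.
    NOTE: this is a hypothetical helper for illustration only.
    """
    frame = np.zeros((2, height, width), dtype=np.float32)
    for t, x, y, p in events:
        channel = 0 if p > 0 else 1
        frame[channel, y, x] += 1.0  # count events per pixel and polarity
    return frame

# Toy stream: two positive events at pixel (x=1, y=1), one negative at (x=2, y=0).
events = [(0.00, 1, 1, +1), (0.01, 1, 1, +1), (0.02, 2, 0, -1)]
frame = events_to_frame(events, height=4, width=4)
print(frame[0, 1, 1])  # 2.0 (positive-event count at that pixel)
print(frame[1, 0, 2])  # 1.0 (negative-event count at that pixel)
```

More refined representations (e.g., time-surface or voxel-grid encodings that also bin over time) follow the same pattern but preserve more temporal information, which is exactly what makes event data robust to the egomotion discussed above.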

Relators: Barbara Caputo, Mirco Planamente, Chiara Plizzari, Matteo Matteucci
Academic year: 2021/22
Publication type: Electronic
Number of Pages: 141
Degree programme: Master's degree programme in Data Science and Engineering
Degree class: New organization > Master of Science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/20568