
Automating Upper Limb Activity Labeling in Egocentric Video: A Deep Learning Strategy

Marta De Iasi

Automating Upper Limb Activity Labeling in Egocentric Video: A Deep Learning Strategy.

Supervisors: Danilo Demarchi, Paolo Bonato, Giulia Corniani. Politecnico di Torino, Master's degree program in Biomedical Engineering, 2024

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract:

The advent of cutting-edge medical technologies and telehealth services has produced an explosion of health-related data, highlighting the urgent need for efficient data annotation in healthcare research. Manually labeling video footage to identify specific actions or features in medical imaging is time-consuming and requires specialized expertise, causing significant delays in research progress.

This thesis addresses this challenge by focusing on the annotation of upper limb movements in egocentric video data. It introduces a minimally supervised deep learning system designed to streamline this process. The proposed framework analyzes video recordings from head-mounted cameras capturing individuals performing everyday tasks. Central to the system are two key components: the Hand Object Detector (HOD) and the Snorkel model. The HOD, based on Faster R-CNN and CNN architectures, identifies hands and their interactions with objects. Complementarily, Snorkel generates probabilistic labels for unlabeled data by applying custom labeling functions tailored to the observed actions. The pipeline enhances these models with customized modules and, crucially, integrates a Large Language Model (LLM) to support the labeling functions in Snorkel, improving their accuracy by refining their decisions based on the HOD output. This combination substantially reduces the need for manual annotation, automating much of the video labeling process.

To validate the approach, the framework was applied to a carefully curated dataset. The results demonstrate its capability to accurately detect hand-object interactions and classify various hand activities, proving particularly beneficial for monitoring upper limb function in stroke survivors. This advancement marks a significant step forward in medical data annotation.
By automating the identification and categorization of hand movements, the method not only reduces the manual workload but also improves the precision of healthcare-focused machine learning models. Moreover, it offers a scalable solution for managing the ever-increasing volume of medical data. This approach highlights the potential of minimally supervised deep learning and LLMs in medical video annotation. It is poised to accelerate the development of advanced medical technologies and enhance patient care strategies, addressing a critical need in the rapidly evolving landscape of healthcare research and practice.
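The weak-supervision scheme described above can be illustrated with a short sketch. This is a plain-Python illustration of the labeling-function idea, not the thesis's actual code: it does not use the real snorkel package (which combines votes with a generative label model rather than a majority vote), and the frame fields ("contact", "object"), the two-class scheme, and the function names are illustrative assumptions.

```python
# Sketch of weak supervision via labeling functions, assuming the HOD
# emits per-frame records with hypothetical "contact" and "object" fields.
ABSTAIN = -1
NO_INTERACTION, INTERACTION = 0, 1

def lf_hand_contact(frame):
    """Vote INTERACTION when the detector reports hand-object contact."""
    if "contact" not in frame:
        return ABSTAIN
    return INTERACTION if frame["contact"] else NO_INTERACTION

def lf_object_present(frame):
    """Vote INTERACTION only when a manipulable object is detected."""
    obj = frame.get("object")
    if obj is None:
        return ABSTAIN
    return INTERACTION if obj != "none" else NO_INTERACTION

def combine(frame, lfs):
    """Majority vote over non-abstaining labeling functions
    (a crude stand-in for Snorkel's generative label model)."""
    votes = [v for lf in lfs if (v := lf(frame)) != ABSTAIN]
    if not votes:
        return ABSTAIN
    return max(set(votes), key=votes.count)

lfs = [lf_hand_contact, lf_object_present]
frames = [
    {"contact": True, "object": "cup"},    # both LFs vote INTERACTION
    {"contact": False, "object": "none"},  # both vote NO_INTERACTION
    {"object": "cup"},                     # only lf_object_present votes
]
labels = [combine(f, lfs) for f in frames]
print(labels)  # [1, 0, 1]
```

In the full pipeline, such heuristic votes would be noisier and more numerous, which is why Snorkel learns to weight and denoise them into probabilistic labels; the LLM described in the abstract would act as an additional refinement step on top of the HOD output.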

Supervisors: Danilo Demarchi, Paolo Bonato, Giulia Corniani
Academic year: 2023/24
Publication type: Electronic
Number of pages: 80
Subjects:
Degree program: Master's degree program in Biomedical Engineering
Degree class: New regulations > Master's degree > LM-21 - Biomedical Engineering
Joint-supervision institution: Spaulding Research Institute (UNITED STATES)
Collaborating organizations: Spaulding Rehabilitation Hospital, Harvard Medical School
URI: http://webthesis.biblio.polito.it/id/eprint/32139