
Development and evaluation of a Machine Learning pipeline for the generation of video annotations

Luca Spagnuolo

Development and evaluation of a Machine Learning pipeline for the generation of video annotations.

Supervisors: Danilo Demarchi, Paolo Bonato, Giulia Corniani. Politecnico di Torino, Master's degree course in Electronic Engineering, 2023

License: Creative Commons Attribution Non-commercial No Derivatives.


The rapid growth of machine learning techniques has driven advances across many domains, including health, medicine, and rehabilitation. However, the effectiveness of these methods relies heavily on the availability of large, well-annotated datasets. In medicine, a common challenge is that extensive datasets exist but often lack comprehensive labeling. This limitation hampers the development and deployment of automated algorithms across medical applications. This thesis addresses the challenge of unlabeled video datasets by proposing a novel approach for generating labels automatically in the context of monitoring upper limb activity in stroke patients. Leveraging developments in deep learning and computer vision, the proposed framework extracts relevant features from video sequences and feeds them into Snorkel to generate labeled data of hand activity. By using weakly supervised learning techniques, the framework is designed to learn effectively from limited annotated samples and to generalize to unlabeled data. The study begins by exploring state-of-the-art machine learning architectures that can learn from scarce data, and then focuses on weakly supervised machine learning and the generative model used by Snorkel. To evaluate the proposed approach, a comprehensive dataset of video recordings of upper limb activity in stroke patients is analyzed, and quantitative and qualitative assessments compare the automatically generated labels against manual annotations. The metrics include accuracy, Intersection over Union, F1-score, and confusion matrices. Visual comparisons of generated labels and ground truth annotations provide insight into the system’s interpretability. Overall, the pipeline achieved an F1-score of 76%. The results of this study offer an effective solution to the issue of limited labeled data in stroke patient video analysis.
The proposed framework showcases the potential of machine learning in rehabilitation, even when confronted with large unlabeled datasets. Furthermore, the methodologies developed can serve as a blueprint for addressing similar challenges in other medical domains that require video data analysis with limited labeled samples.
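
As an illustration of the evaluation metrics named in the abstract, the following minimal Python sketch computes a confusion matrix, accuracy, Intersection over Union, and F1-score for binary per-frame labels. It rests on assumptions that do not come from the thesis: the labels are treated as one binary value per video frame (1 for hand activity, 0 for none), Intersection over Union is computed over frames of the positive class, and the function names and example label sequences are hypothetical. The thesis may define these quantities differently, for example over temporal segments.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(truth: Sequence[int], pred: Sequence[int],
                     positive: int = 1) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary per-frame labels."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    tp = fp = fn = tn = 0
    for t, p in zip(truth, pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def frame_metrics(truth: Sequence[int], pred: Sequence[int],
                  positive: int = 1) -> Dict[str, float]:
    """Accuracy, IoU, and F1-score of the positive class (assumed definitions)."""
    tp, fp, fn, tn = confusion_counts(truth, pred, positive)
    total = tp + fp + fn + tn
    union = tp + fp + fn  # frames labeled positive by either source
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "iou": tp / union if union else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if union else 0.0,
    }


# Hypothetical labels for eight frames; not data from the thesis.
manual = [0, 1, 1, 1, 0, 0, 1, 0]
generated = [0, 1, 1, 0, 0, 1, 1, 0]

print(confusion_counts(manual, generated))  # (tp, fp, fn, tn)
print(frame_metrics(manual, generated))
```

For the hypothetical sequences above, the counts are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, which gives an accuracy of 0.75, an Intersection over Union of 0.6, and an F1-score of 0.75.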

Supervisors: Danilo Demarchi, Paolo Bonato, Giulia Corniani
Academic year: 2023/24
Publication type: Electronic
Number of Pages: 74
Degree course: Master's degree course in Electronic Engineering
Degree class: New organization > Master science > LM-29 - ELECTRONIC ENGINEERING
Joint supervision institution: Motion Analysis Laboratory, Spaulding Rehabilitation Hospital (United States of America)
Collaborating companies: Spaulding Rehabilitation Hospital
URI: http://webthesis.biblio.polito.it/id/eprint/28654