Transformer-based model evaluation of reduced left ventricular function using apical 4-chamber and parasternal long axis echocardiographies compared with a 3D CNN

Andrea Bedetti

Supervisor: Gabriella Olmo. Politecnico di Torino, Master's degree programme in Biomedical Engineering, 2025

Abstract:	Cardiovascular disease is a leading cause of death globally, with more than 19 million deaths recorded in 2021, and it particularly affects the elderly population. Among the associated pathological conditions, one of the most prominent is the reduced ability of the left ventricle to contract effectively, which impairs the heart's capacity to pump blood adequately throughout the body. Clinically, this dysfunction often manifests as symptoms of heart failure and fatigue, so a rapid screening method is essential for early detection and timely intervention. Echocardiography is the current clinical standard for diagnosing cardiovascular disease because it provides real-time imaging with high temporal resolution. However, assessment of left ventricular function is subject to high intra- and inter-operator variability, which may compromise diagnostic accuracy. In recent years, artificial intelligence-based models have revolutionized medical image analysis, automating the diagnostic process and improving efficiency without compromising accuracy. In particular, models based on Transformer architectures have achieved promising results through attention mechanisms, which can capture subtle spatial differences between frames within the same video. The purpose of this work is to evaluate the performance of a transformer-based deep learning model in classifying reduced left ventricular function (RLVF) from apical 4-chamber (A4C) and parasternal long axis (PLAX) echocardiography, and to compare it with that of a 3D CNN architecture. In a first step, a dataset of 110 A4C echocardiographic videos was used to train, validate, and test a deep learning model consisting of a ResNet and a temporal transformer encoder. Geometric data augmentation was then investigated to explore its impact on model robustness and to enable comparison with the performance of an R(2+1)D classifier.
Our model identified RLVF cases with an accuracy of 82.6% and an F1-score of 77.8%. Next, to evaluate its performance on a larger dataset, the model was trained on a mixed set containing A4C and PLAX videos of the same patients, obtaining an accuracy of 80.4% and an F1-score of 74.3%. Finally, a temporal validation was conducted to analyze the generalizability of the model on the two datasets. In conclusion, the study shows that the 3D CNN remains more accurate in classifying RLVF. The transformer model, while achieving superior performance when trained on a single dataset, shows lower generalizability than when trained on a heterogeneous dataset. Both models, however, proved to be valuable tools for clinical decision support, offering promising results in the automated evaluation of echocardiography.
Supervisors:	Gabriella Olmo
Academic year:	2024/25
Publication type:	Electronic
Number of pages:	76
Additional information:	Thesis under embargo. Full text not available
Subjects:
Degree programme:	Master's degree programme in Biomedical Engineering
Degree class:	New system > Master's degree > LM-21 - BIOMEDICAL ENGINEERING
Collaborating companies:	Teoresi SPA
URI:	http://webthesis.biblio.polito.it/id/eprint/36123