Language and Vision models for PET Assisted Reporting

Jacopo Bracci

Language and Vision models for PET Assisted Reporting.

Rel. Flavio Giobergia, Nicolo' Capobianco. Politecnico di Torino, Corso di laurea magistrale in Data Science And Engineering, 2024

Abstract

PET and CT scans are essential diagnostic tools for detecting tumors and support doctors worldwide in making diagnoses. Over time, advanced algorithms such as convolutional neural networks have been introduced to detect lesions automatically, leading to the development of lesion segmentation models. While these models now help physicians improve diagnostic accuracy, they take only images as input and therefore use only the visual modality. However, it is common practice for physicians to write textual reports that describe their findings. These reports include comprehensive descriptions of the identified lesions, specifying important details such as anatomical location, dimensions, and notable characteristics.

Moreover, this information is reliable because it is provided by experienced physicians, ensuring that the descriptions are not only accurate but also clinically relevant.