Few-shot Learning in Vision Transformers for Skin Cancer Semantic Segmentation

Francesco Di Gangi

Supervisor: Tatiana Tommasi. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2024

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Skin cancer is one of the most prevalent and potentially life-threatening diseases. It is characterized by aberrant skin cell proliferation, mostly caused by DNA damage due to exposure to ultraviolet (UV) radiation from the sun or other sources, such as tanning beds. There are several types of skin cancer, such as melanoma, squamous cell carcinoma, and basal cell carcinoma, each with unique traits and implications for treatment and diagnosis. Timely detection of skin cancer is fundamental to maximizing the probability of successful treatment. In this regard, computer-assisted diagnosis plays a crucial role and is supported by the automated analysis of images through segmentation. Segmentation recognizes and outlines regions of interest in dermoscopic images, helping healthcare practitioners to identify and assess lesions precisely. This enables early diagnosis and the tracking of skin changes over time, ultimately facilitating prompt medical intervention. In the context of skin cancer, the ultimate goal of segmentation is to offer dermatologists and other medical professionals accurate tools that help them quickly identify and treat concerning skin lesions. The use of deep learning for image segmentation has been widely documented in the recent literature. In particular, Convolutional Neural Networks (CNNs) have been shown to segment skin lesions accurately by learning significant characteristics of dermoscopic images from a training dataset. Although convolutional operators have been an essential component of image processing, they have a limited ability to handle spatial variations and capture global context, and Vision Transformers (ViTs) have therefore become more popular.
ViTs are motivated by the success of Transformers in natural language processing and use a global attention mechanism that lets them analyze an image by taking into account the relationships between all of its components, regardless of how far apart they are. As a result, ViTs are able to outperform CNNs in a variety of visual tasks and overcome some of their shortcomings. Their main limitation is the lack of the large amounts of labeled data needed to train them. This thesis explores the use of Vision Transformers for image segmentation within a few-shot learning paradigm, in which a (potentially pre-trained) model is further refined to make accurate predictions or perform specific tasks using only a very small amount of labeled training data, which is particularly appropriate in the described scenario. The experimental assessment relies on publicly available datasets, such as the ISIC dataset, which contains dermoscopic images for research and development in dermatology, particularly for skin cancer detection and diagnosis. Because sufficient medical data are rarely available in real-world scenarios, cross-domain segmentation methodologies combined with few-shot learning techniques are then implemented and studied. Experimental results are presented and compared with the most recent works, indicating that segmentation could prove beneficial in both current and forthcoming medical endeavors.
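
The global attention idea mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the toy 4x4 image, and the use of identity projections in place of learned query, key and value matrices are all assumptions made for brevity. The image is split into patches, and every patch is compared with every other patch, so two distant patches with similar content receive a high attention weight.

```python
import math

def split_into_patches(image, patch):
    """Split a 2D image (list of rows) into flattened, non-overlapping patches.
    Assumes the image height and width are multiples of `patch`."""
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def global_self_attention(tokens):
    """Single-head scaled dot-product self-attention.
    Assumption: queries, keys and values are the tokens themselves
    (identity projections); a real ViT learns these projections."""
    d = len(tokens[0])
    weights, outputs = [], []
    for q in tokens:
        # Compare this patch with every patch, near or far.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of all patch vectors.
        outputs.append([sum(w_j * v[i] for w_j, v in zip(w, tokens))
                        for i in range(d)])
    return weights, outputs

# Toy image: bright top-left and bottom-right corners, dark elsewhere.
image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
tokens = split_into_patches(image, patch=2)
weights, outputs = global_self_attention(tokens)
```

In this toy case the top-left patch attends to the bottom-right patch as strongly as to itself, and more strongly than to its dark neighbours, which is the distance-independent behaviour the abstract describes. A convolution with a small kernel would only see the immediate neighbourhood of each pixel.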

Supervisors: Tatiana Tommasi
Academic year: 2024/25
Publication type: Electronic
Number of pages: 81
Subjects:
Degree programme: Master's degree programme in Ingegneria Informatica (Computer Engineering)
Degree class: New system > Master's degree > LM-32 - Computer Engineering
Collaborating organizations: Keele University
URI: http://webthesis.biblio.polito.it/id/eprint/33129