
Improving Mammography Triage through Self-Supervised Pre-Training

Mattia Lisciandrello

Improving Mammography Triage through Self-Supervised Pre-Training.

Supervisors: Lia Morra, Fabrizio Lamberti. Politecnico di Torino, Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering), 2022


Breast cancer is one of the most common forms of cancer today, and diagnosing it early is pivotal to reducing mortality. In early cancer prevention programs, women undergo biennial exams in which up to four high-resolution images are acquired: two projections, craniocaudal (CC) and mediolateral oblique (MLO), for each breast. These exams have to be reviewed by radiologists; however, because of the large number of women who undergo screening mammography, radiologists are often supported by computer-aided detection (CAD) systems to speed up this process. Deep learning-based techniques could be used for the automated screening of digital mammography images in order to identify cases with a high probability of being normal and, secondarily, to identify lesions. This problem, known as the triage task, is technically challenging because mammography images are much larger than typical RGB images and subtle information from multiple views must be integrated into a single prediction; for these reasons, the trained networks must operate in robust and interpretable ways. The goal of this thesis is to use self-supervised pre-training to increase the performance of deep neural networks in the triage task, starting from pre-existing architectures. Self-supervised learning is a learning method in which a surrogate supervised task is created from unlabeled data. Self-supervised techniques are used to pre-train the model, which is then fine-tuned to transfer the captured knowledge to a different target task, which in our case is mammography triage. The implementation of one such technique, called Swapping Assignments between Views (SwAV), is discussed in detail in this thesis. SwAV clusters the data while enforcing consistency between the cluster assignments produced for different augmentations of the same image.
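The consistency idea behind Swapping Assignments between Views can be illustrated with a small NumPy sketch. This is not the thesis implementation: the function names, the temperature, the Sinkhorn settings and the toy tensor shapes are all assumptions chosen for illustration. Each augmented view is scored against a set of prototypes, the scores are turned into balanced soft cluster assignments, and the assignment of one view is used as the target for the prediction made from the other view.

```python
import numpy as np


def sinkhorn(scores, eps=0.05, n_iters=3):
    """Convert (batch, prototypes) scores into soft assignments (rows sum to 1)
    that spread the batch roughly evenly over the prototypes."""
    q = np.exp((scores - scores.max()) / eps).T  # shape: (prototypes, batch)
    q /= q.sum()
    n_protos, n_samples = q.shape
    for _ in range(n_iters):
        q /= q.sum(axis=1, keepdims=True)  # equalise mass per prototype
        q /= n_protos
        q /= q.sum(axis=0, keepdims=True)  # equalise mass per sample
        q /= n_samples
    return (q * n_samples).T


def log_softmax(scores, temperature):
    z = scores / temperature
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))


def swapped_prediction_loss(z1, z2, prototypes, temperature=0.1):
    """Cross-entropy between the assignment of one view and the prediction
    obtained from the other view, averaged over both directions."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    c = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    s1, s2 = z1 @ c.T, z2 @ c.T          # similarity to each prototype
    q1, q2 = sinkhorn(s1), sinkhorn(s2)  # targets (no gradient in practice)
    p1, p2 = log_softmax(s1, temperature), log_softmax(s2, temperature)
    return -0.5 * np.mean(np.sum(q1 * p2, axis=1) + np.sum(q2 * p1, axis=1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(8, 16))                  # hypothetical embeddings
    view_b = view_a + 0.01 * rng.normal(size=(8, 16))  # a second "augmentation"
    protos = rng.normal(size=(4, 16))                  # hypothetical prototypes
    print(swapped_prediction_loss(view_a, view_b, protos))
```

In a real training loop the embeddings would come from the network being pre-trained, and the loss would be minimised with respect to both the network weights and the prototypes.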
The result of the pre-training was thoroughly analyzed and then used to enhance two deep neural networks that suffered heavily from overfitting: the breast cancer classifier developed at New York University (NYU) and the architecture known as Anatomy-aware Graph convolutional Network (AGN). SwAV was trained on the Karolinska Dataset, which includes exams from approximately 11000 women of different ages, and on the publicly available CBIS-DDSM, with around 1500 participants. To train SwAV, patches were generated from these datasets, discarding patches that mostly contained background. At the end of this work, an analysis of the prototypes generated by SwAV is presented and discussed in depth, along with the results obtained through pre-training for both NYU and AGN. Results show that self-supervised pre-training improves the performance of both architectures, as measured by the Area Under the ROC Curve (AUC), which quantifies the ability of a classifier to distinguish between classes.
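The patch generation step described above can be sketched as follows. The abstract does not state the patch size, the stride, or how background was detected, so the values and the intensity-threshold rule below are purely hypothetical: a patch is discarded when more than a given fraction of its pixels fall at or below a background intensity.

```python
import numpy as np


def extract_patches(image, patch_size=256, stride=256,
                    bg_threshold=0.05, max_bg_fraction=0.8):
    """Return (row, col, patch) tuples for patches that are not mostly background.

    All parameters are illustrative assumptions, not values from the thesis.
    `image` is a 2-D array with intensities scaled to [0, 1].
    """
    kept = []
    height, width = image.shape
    for row in range(0, height - patch_size + 1, stride):
        for col in range(0, width - patch_size + 1, stride):
            patch = image[row:row + patch_size, col:col + patch_size]
            # Fraction of pixels considered background in this patch.
            bg_fraction = np.mean(patch <= bg_threshold)
            if bg_fraction <= max_bg_fraction:
                kept.append((row, col, patch))
    return kept


if __name__ == "__main__":
    # Toy 8x8 "mammogram": tissue on the left half, empty background on the right.
    toy = np.zeros((8, 8))
    toy[:, :4] = 1.0
    patches = extract_patches(toy, patch_size=4, stride=4)
    print([(r, c) for r, c, _ in patches])
```

On real mammograms the threshold would need to be tuned to the acquisition device and preprocessing, since background intensity is not always exactly zero.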

Supervisors: Lia Morra, Fabrizio Lamberti
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 105
Additional Information: Thesis under embargo. Full text not available
Degree programme: Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering)
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/24695