
Towards universal lightweight models for image segmentation

Gabriele Rosi

Towards universal lightweight models for image segmentation.

Supervisors: Giuseppe Bruno Averta, Fabio Cermelli, Barbara Caputo, Antonio Tavera. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2023

PDF (Tesi_di_laurea) - Thesis
Restricted to: Repository staff only until 28 July 2026 (embargo date).
License: Creative Commons Attribution Non-commercial No Derivatives.


Image segmentation is an essential task in computer vision that plays an important role in a variety of applications, such as object recognition, autonomous driving, and medical imaging. It comprises three tasks: semantic segmentation, which classifies individual pixels in an image into specific classes; instance segmentation, which detects and classifies each object instance in the image; and panoptic segmentation, which combines semantic and instance segmentation to identify and classify both pixels and object instances in the image. In recent decades, research has focused on designing specialized architectures for each of these fundamental tasks, and state-of-the-art results have been achieved across different datasets and domains. However, when switching from one task to another, the architecture, the training methods, the losses, and the implementation details still need to be modified to achieve good results. Universal architectures try to fill this gap by proposing a unified framework in which a single architecture can be used for semantic, instance, and panoptic segmentation without any modification other than retraining. Furthermore, the need for small and fast computer vision models for mobile devices is becoming increasingly important. Because of their limited computing power, small available memory, and power constraints, such devices cannot run the current state-of-the-art image segmentation models, which are too large and too complex. In addition, these devices are often used in real-time scenarios where inference speed is critical. In this work, we address both of these problems and take a step towards a universal mobile image segmentation architecture. In particular, starting from a state-of-the-art image segmentation model, we identify the most computationally demanding parts of the architecture and make selected modifications to make them more lightweight.
Moreover, we present two new lightweight components that improve our architecture and further reduce its computational complexity with respect to the original architecture. Compared to the initial implementation, our final model reduces FLOPs by nearly 8 times and the parameter count by 3 times, while achieving good performance in semantic segmentation (75.7 mIoU) and panoptic segmentation (41.3 PQ) on the Cityscapes dataset.
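
As an illustration of the mIoU metric quoted above, the following is a minimal sketch in Python; it is not the evaluation code used in the thesis. It assumes label maps flattened to lists of integer class ids, and the function name `mean_iou` and the toy data are hypothetical. Benchmark scores such as those reported on Cityscapes are normally accumulated over a whole validation set rather than computed on a single image, so this only shows the per-class intersection-over-union arithmetic.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union over the classes present in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of pixels")

    intersection = [0] * num_classes  # pixels where prediction and label agree
    pred_count = [0] * num_classes    # pixels predicted as each class
    target_count = [0] * num_classes  # pixels labelled as each class

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious: List[float] = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both maps
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 label maps, flattened row by row (hypothetical data).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

Classes that appear in neither map are skipped so that they do not distort the average; this is one common convention and an assumption here, not a detail taken from the thesis.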

Supervisors: Giuseppe Bruno Averta, Fabio Cermelli, Barbara Caputo, Antonio Tavera
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 91
Degree programme: Master's degree programme in Data Science and Engineering
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/27696