Few Shot Adaptation of VLM for Panoptic Segmentation
Masoud Karimi
Rel. Andrea Bottino. Politecnico di Torino, Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering), 2024
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
This project aims to refine the SEEM (Segment Everything Everywhere All At Once Model) foundation model through advanced fine-tuning techniques and thorough dataset preparation. Dataset preparation includes creating annotated ground-truth images for panoptic learning and generating grounding annotation files. These annotations comprise segmentation masks and bounding boxes, with each category uniquely colored. Grounding annotations link segmentation details with descriptive sentences, establishing connections between segmentations and their context. Recently, transformers have demonstrated significant success in various computer vision domains thanks to their dynamic modeling capabilities and ability to capture long-range dependencies. Vision transformers have outperformed CNN models in tasks such as object detection and semantic segmentation.
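The grounding-annotation step described above can be sketched as follows. This is a minimal illustration of linking one segmentation (its category color and bounding box) to a descriptive sentence; the field names and JSON layout are assumptions for illustration, not SEEM's actual annotation schema.

```python
import json

def make_grounding_annotation(image_id, category, mask_color, bbox, sentence):
    """Build one grounding entry tying a segmentation to a descriptive sentence.

    mask_color is the unique RGB triple assigned to this category in the
    ground-truth image; bbox is (x, y, width, height). All field names are
    illustrative -- the real SEEM schema may differ.
    """
    return {
        "image_id": image_id,
        "category": category,
        "mask_color": list(mask_color),
        "bbox": list(bbox),
        "grounding_text": sentence,
    }

entry = make_grounding_annotation(
    image_id=42,
    category="person",
    mask_color=(220, 20, 60),
    bbox=(10, 20, 100, 200),
    sentence="a person walking on the left side of the road",
)
print(json.dumps(entry, indent=2))
```

One such entry would be emitted per annotated instance, so that each uniquely colored mask in the ground-truth image has both geometric (mask, box) and textual (grounding sentence) supervision.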
To incorporate domain-specific knowledge into SEEM, we focused on fine-tuning the vision backbone, the segmentation head, or both, using adapters to preserve the model's original parameters.
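The adapter idea can be sketched in a few lines: the pretrained weights stay frozen, while a small bottleneck branch (down-projection, non-linearity, up-projection) is added residually and trained. This toy NumPy sketch is a generic bottleneck adapter under assumed dimensions, not SEEM's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_bottleneck = 16, 4  # illustrative sizes, not SEEM's real dimensions

# Frozen backbone weight: the original pretrained parameters stay untouched.
W_frozen = rng.normal(size=(d_model, d_model))

# Small trainable adapter: down-project, ReLU, up-project.
W_down = rng.normal(scale=0.01, size=(d_model, d_bottleneck))
W_up = np.zeros((d_bottleneck, d_model))  # zero-init: adapter starts as a no-op

def layer_with_adapter(x):
    h = x @ W_frozen                            # frozen backbone computation
    h = h + np.maximum(h @ W_down, 0.0) @ W_up  # residual adapter branch
    return h

x = rng.normal(size=(2, d_model))
out = layer_with_adapter(x)
# With W_up zero-initialized, the adapter contributes nothing at the start
# of training, so the fine-tuned model initially matches the pretrained one.
assert np.allclose(out, x @ W_frozen)
```

During fine-tuning only `W_down` and `W_up` receive gradients, which is why the original model parameters are preserved exactly.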