Arda Eren Dogru
Encoder-Only Multi-Task Learning.
Supervisors: Carlo Masone, Fabio Cermelli. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2025
| Abstract: | Multi-task learning (MTL) promises to enhance the efficiency of computer vision systems by solving multiple tasks, such as dense prediction and recognition, within a single model. However, this paradigm is often hindered by negative transfer, where conflicting task objectives degrade performance, a critical issue for real-time applications in fields like autonomous driving and robotics. This thesis challenges the prevailing assumption that mitigating negative transfer requires complex, task-specific decoders. We introduce a minimalist, encoder-only framework grounded in deep architectural unification: heterogeneous tasks share a single ViT backbone and a uniform query-to-mask spatial projection, with polymorphic lightweight heads producing task-specific outputs. Our approach combines two existing frameworks. We adopt the query-based, encoder-only prediction approach from EoMT, which operates within the final layers of a DINOv2 (ViT) backbone. The resulting shared query embeddings are then interpreted by the polymorphic heads from PolyMaX to produce predictions for multiple, distinct dense prediction tasks. On challenging indoor and outdoor benchmarks (NYUv2 and Cityscapes), our model achieves a new state of the art for overall multi-task performance. We provide strong evidence of positive transfer for segmentation: joint training measurably improves segmentation quality, producing sharper and more geometrically coherent predictions. This work delivers an efficient, reproducible, and powerful baseline for dense MTL, showing that a heavily unified architecture can achieve state-of-the-art results while outperforming more complex designs, paving the way for more scalable and robust multi-task systems. |
|---|---|
| Supervisors: | Carlo Masone, Fabio Cermelli |
| Academic year: | 2025/26 |
| Publication type: | Electronic |
| Number of pages: | 106 |
| Additional information: | Thesis under embargo. Full text not available |
| Subjects: | |
| Degree programme: | Master's degree programme in Data Science And Engineering |
| Degree class: | New system > Master's degree > LM-32 - COMPUTER ENGINEERING |
| Collaborating companies: | FOCOOS AI S.R.L. |
| URI: | http://webthesis.biblio.polito.it/id/eprint/37832 |
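
The abstract describes a uniform query-to-mask projection whose shared query embeddings are read out by polymorphic heads. Since the full text is not available, the following is only a minimal sketch of that general idea, assuming the common mask-transformer formulation in which each pixel is softly assigned to queries and the task output is an assignment-weighted mix of per-query outputs. It is not the thesis implementation: the function names (`query_to_mask`, `polymorphic_predict`), the toy values, and the fixed per-query outputs standing in for learned heads are all hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def query_to_mask(queries, pixel_feats):
    # For each pixel, a soft assignment over queries:
    # softmax of the query/feature dot products.
    return [softmax([dot(q, f) for q in queries]) for f in pixel_feats]

def polymorphic_predict(assign, per_query_out):
    # Mix per-query outputs by each pixel's assignment weights.
    # The output width is task-specific (classes, depth channel, ...).
    dim = len(per_query_out[0])
    return [
        [sum(w * out[k] for w, out in zip(weights, per_query_out))
         for k in range(dim)]
        for weights in assign
    ]

# Hypothetical toy setup: 2 queries, 3 pixels, feature dimension 2.
queries = [[4.0, 0.0], [0.0, 4.0]]
pixel_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# One shared spatial projection, reused by every task.
assign = query_to_mask(queries, pixel_feats)

# Stand-ins for lightweight heads applied to the query embeddings
# (in a real model these would be small learned layers).
seg_scores = [[1.0, 0.0], [0.0, 1.0]]  # per-query class scores, 2 classes
depth_values = [[1.5], [4.0]]          # per-query depth value

seg = polymorphic_predict(assign, seg_scores)
depth = polymorphic_predict(assign, depth_values)
seg_labels = [max(range(len(p)), key=p.__getitem__) for p in seg]
```

In this sketch only the per-query outputs differ between tasks; the assignment `assign` is computed once and shared, which is one way to read the "uniform projection, polymorphic heads" split described in the abstract.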



Creative Commons License - Attribution 3.0 Italy