Leonardo Dardanello
A Scene Graph-based approach for Text-to-Image generation.
Supervisor: Lia Morra. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2024
PDF (Tesi_di_laurea)
- Thesis
License: Creative Commons Attribution Non-commercial No Derivatives. Size: 4MB.
Archive (ZIP) (Documenti_allegati)
- Other
License: Creative Commons Attribution Non-commercial No Derivatives. Size: 149kB.
Abstract: |
With the advent of generative artificial intelligence models, particularly text-to-image models, it has become possible to produce synthetic images of increasing quality and detail. However, the images produced adhere poorly to the requested spatial positioning of entities. This work presents a new text-to-image framework for generating synthetic images. Building on prior work in the literature, it aims to improve the model's ability to understand and spatially position objects and subjects within the image. This is achieved by exploiting scene graphs, which describe the relationships between the entities in the image. To train the model comprehensively and effectively, scene graphs were extracted from existing datasets and collected in a publicly available dataset, focusing on dense textual captions that are rich in spatial relationships. |
---|---|
Supervisors: | Lia Morra |
Academic year: | 2024/25 |
Publication type: | Electronic |
Number of pages: | 70 |
Subjects: | |
Degree programme: | Master's degree programme in Computer Engineering (Ingegneria Informatica) |
Degree class: | New system > Master's degree > LM-32 - INGEGNERIA INFORMATICA |
Collaborating companies: | NOT SPECIFIED |
URI: | http://webthesis.biblio.polito.it/id/eprint/34075 |