Using neural network models and subject-driven text-to-image generation techniques to classify and generate cinematographic shots

Arefeh Mohammad Nejad

Supervisor: Tania Cerquitelli. Politecnico di Torino, not specified, 2024

Abstract:

Cinematographic shots are the building blocks of visual storytelling in filmmaking. They are the individual frames or images captured by a camera and assembled into the sequence that forms a movie. Subject-driven text-to-image generation is a field at the intersection of artificial intelligence and visual arts. It uses machine learning models to create images from textual descriptions, or prompts: the models interpret the context provided in the text and generate visual content that reflects the described subject. In tasks such as AI-assisted video editing and storyboarding, it is essential to be able to produce images with a user-specified shot type. (A storyboard is a visual representation of how a story will play out, scene by scene: a chronological series of images with accompanying notes.) This work considers five cinematographic shot types: Close Up, Medium Close Up, Medium Shot, Medium Long Shot, and Long Shot. I performed cinematographic shot classification using different neural network models. I then used the DreamBooth model to generate images with the shot type specified in the text prompt, producing high-quality, highly detailed images with the desired shot type. I was also able to generate images of a specific subject, such as an actor, for each of the five shot types considered.

Supervisors: Tania Cerquitelli
Academic year: 2023/24
Publication type: Electronic
Number of pages: 70
Additional information: Restricted thesis. Full text not available
Subjects:
Degree programme: Not specified
Degree class: New organization > Master's degree > LM-32 - Computer Engineering (Ingegneria Informatica)
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/31001