Arefeh Mohammad Nejad
Using neural network models and subject-driven text-to-image generation techniques to classify and generate cinematographic shots.
Supervisor: Tania Cerquitelli. Politecnico di Torino, Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering), 2024
Abstract: |
Cinematographic shots are the building blocks of visual storytelling in filmmaking. They are the individual frames or images captured by a camera and assembled into the sequence that forms a movie. Subject-driven text-to-image generation is a cutting-edge field at the intersection of artificial intelligence and visual arts. It uses machine learning models to create images from textual descriptions, or prompts. By interpreting the context provided in the text, these models generate visual content that reflects the described subject. In tasks such as AI-assisted video editing and storyboarding, it is essential to be able to produce images with a user-specified shot type. (A storyboard is a visual representation of how a story will play out, scene by scene; it consists of a chronological series of images with accompanying notes.) In this work, five cinematographic shot types were considered: Close Up, Medium Close Up, Medium Shot, Medium Long Shot, and Long Shot. I performed cinematographic shot classification using different neural network models. Furthermore, I used the DreamBooth model to generate images with the shot type specified in the text prompt, producing high-quality, highly detailed images with the desired shot type. I was also able to generate images of a specific subject, such as an actor, for each of the five shot types considered. |
---|---|
Supervisors: | Tania Cerquitelli |
Academic year: | 2023/24 |
Publication type: | Electronic |
Number of Pages: | 70 |
Additional Information: | Restricted thesis. Full text not available |
Subjects: | |
Degree programme: | Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering) |
Degree class: | New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING |
Collaborating companies: | Politecnico di Torino |
URI: | http://webthesis.biblio.polito.it/id/eprint/31001 |