Mattia Delleani
Structured latent embeddings for generating and reposing DXA images.
Supervisor: Lia Morra. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2022
Abstract: |
Recent advances in machine learning make it possible to map image features and semantic descriptions into aligned latent representations. These representations are useful because they capture the "essence" of the observed and described elements, allowing generalization to unseen domains and classes. They also make it possible both i) to generate new images from arbitrary semantic descriptions and ii) to generate semantic descriptions from input images. In this thesis, we study applications of such tools in the medical context, using DXA (dual-energy X-ray absorptiometry) scans. DXA scans capture subtle characteristics of patients' body structures that are difficult for humans to notice and analyze but are important for a holistic evaluation of the subject. We aim to develop a model whose latent space captures these characteristics, such as pose, the orientation of body parts, and the shape and structure of the body. This task is challenging: when the differences within the data distribution are subtle, it is difficult to learn a structured and discriminative latent space because of mode collapse. To address this, we use a Variational Autoencoder (VAE), a well-known probabilistic generative model. The traditional VAE alone is not sufficient to build a structured and generic latent space. Starting from the VAE, we therefore develop new architectures that exploit 3D human modeling components (the STAR body model), specifically the parameters describing the pose, translation, and shape of the human body. These parameters are used in different ways in the architecture, both to train a pose-shape encoder and to enforce constraints on the reconstruction.
These constraints led us to extend the initial VAE into a more complex architecture, a VAE with a Pose-Shape Encoder, which reconstructs images while taking into account the shape and pose of the patient, something the original VAE could not do. Given a patient in a certain pose, the final model can also re-pose the patient in a pose provided as input. |
---|---|
Supervisors: | Lia Morra |
Academic year: | 2022/23 |
Publication type: | Electronic |
Number of Pages: | 98 |
Additional Information: | Thesis under embargo. Full text not available |
Subjects: | |
Degree programme: | Master's degree programme in Data Science And Engineering |
Degree class: | New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING |
Joint supervision institution: | INSTITUT NATIONAL POLYTECHNIQUE DE GRENOBLE (INPG) - ENSIMAG (FRANCE) |
Collaborating companies: | INRIA |
URI: | http://webthesis.biblio.polito.it/id/eprint/24693 |