
Image generation using deep adversarial generative models on graphs

Michele D'Amico

Image generation using deep adversarial generative models on graphs.

Supervisor: Enrico Magli. Politecnico di Torino, Master's degree course in Ingegneria Informatica (Computer Engineering), 2020

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Generative Adversarial Networks (GANs) are a promising class of generative models used to approximate unknown data distributions so that new samples can be drawn from them. However, their training instability has hindered experimentation with a wide variety of GAN architectures. The introduction of Wasserstein GAN (WGAN) and Wasserstein GAN with gradient penalty (WGAN-GP) overcame this limitation, making it possible to train a broader class of architectures successfully, without instability or convergence issues. Among the possible model architectures for the image generation task, convolutional neural networks (CNNs) excel, as they do in many other subfields of deep learning. Despite the attractive properties of convolutional layers, the building blocks of CNNs, convolution is a local operator and therefore fails to capture long-range dependencies effectively; these dependencies are fundamental for producing plausible samples of image classes that have a well-defined structure. To address this, the project proposes integrating a graph convolution operation into the generator of a convolutional WGAN-GP. The graph-convolutional layer extracts a graph representation of the image data dynamically by building a k-nearest-neighbor graph. In this representation, each vertex carries a feature vector taken from the activation maps and is connected to its k nearest nodes. Distance is measured with the Euclidean metric in feature space rather than in the spatial domain, as it would be for regularly structured data. The convolution is then performed as a node aggregation function over the central node and its neighborhood of size k. The operator thus has an adaptive receptive field covering the areas of the hidden-layer activation maps whose features are similar to those of the central node.
The graph convolution does not replace regular convolution; instead, it complements it by extending its receptive field to capture non-local dependencies as well. The experiments carried out, however, show that this method does not provide the expected improvements: judged by the inception score and by visual inspection, samples generated by the network with graph convolution are very similar to baseline samples obtained with a fully convolutional WGAN-GP.
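
As an illustration of the graph construction described in the abstract, the following minimal Python sketch builds a k-nearest-neighbor graph over the pixels of a tiny activation map, using Euclidean distance in feature space, and then aggregates each node with its neighbors. It is not the thesis's implementation: the function names (`flatten_activation_map`, `knn_graph`, `graph_aggregate`) are hypothetical, and the unweighted mean is an assumed stand-in because the abstract does not specify the aggregation function.

```python
def flatten_activation_map(fmap):
    """Turn an H x W x C nested list into a list of H*W feature vectors (graph nodes)."""
    return [pixel for row in fmap for pixel in row]


def squared_distance(a, b):
    """Squared Euclidean distance; it ranks neighbors the same way as the true distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_graph(nodes, k):
    """For each node, return the indices of its k nearest nodes in feature space.

    Spatial position is ignored on purpose: two pixels are neighbors when
    their feature vectors are close, wherever they sit in the map.
    """
    neighbors = []
    for i, fi in enumerate(nodes):
        ranked = sorted(
            (squared_distance(fi, fj), j) for j, fj in enumerate(nodes) if j != i
        )
        neighbors.append([j for _, j in ranked[:k]])
    return neighbors


def graph_aggregate(nodes, neighbors):
    """Assumed aggregation: unweighted mean of the central node and its k neighbors."""
    out = []
    for i, nbrs in enumerate(neighbors):
        group = [nodes[i]] + [nodes[j] for j in nbrs]
        out.append([sum(v[d] for v in group) / len(group) for d in range(len(nodes[i]))])
    return out


# A 2 x 2 activation map with 2 channels (made-up values for illustration).
fmap = [
    [[0.0, 0.0], [10.0, 10.0]],
    [[10.0, 11.0], [0.0, 1.0]],
]
nodes = flatten_activation_map(fmap)
neighbors = knn_graph(nodes, k=1)
aggregated = graph_aggregate(nodes, neighbors)
print(neighbors)   # [[3], [2], [1], [0]]
print(aggregated)  # [[0.0, 0.5], [10.0, 10.5], [10.0, 10.5], [0.0, 0.5]]
```

In this toy map the top-left pixel (node 0) is linked to the bottom-right pixel (node 3), which is diagonal to it in the spatial grid but closest in feature space. This is the sense in which the receptive field is adaptive rather than fixed to a spatial window.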

Supervisors: Enrico Magli
Academic year: 2019/20
Publication type: Electronic
Number of pages: 84
Subjects:
Degree course: Master's degree course in Ingegneria Informatica (Computer Engineering)
Degree class: New system > Master's degree > LM-32 - INGEGNERIA INFORMATICA (Computer Engineering)
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/15334