
Cycle-consistent Deep Learning Architecture for Improved Text Representation and Translation

Michele D'Addetta

Cycle-consistent Deep Learning Architecture for Improved Text Representation and Translation.

Supervisors: Luca Cagliero, Moreno La Quatra. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2021

Full text: PDF (Tesi_di_laurea), 2 MB. License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract:

This thesis presents CycleNLPGAN, a deep learning architecture that introduces an innovative approach to sentence encoding and alignment. It allows the definition of a latent vector space shared across a pair of languages. The model is jointly trained to perform neural machine translation from a source language A to a target language B, and it generates a shared, aligned vector space suitable for machine translation from the source language to the target one and vice versa. The architecture is based on CycleGAN, a Computer Vision model that addresses image-to-image translation using cycle-consistent dynamics, which enforces the robustness of the resulting model and the quality of the produced data. The architecture is defined using a cycle consistency loss, an approach also used in neural machine translation and in domain adaptation models. The designed model aims at creating two mapping functions, G_AB and G_BA, such that the sentences G_AB(A) generated by G_AB from source language A are indistinguishable from original sentences in language B; the same principle holds for the inverse mapping function. The joint training of the two mapping functions and the introduction of the cycle consistency loss produce a model able to generate sentence embeddings that belong to the shared latent vector space: two sentence embeddings are close in the latent space when the sentences are identical or when sentences in different languages have similar meanings. At the same time, the model is able to produce accurate translations. Translation quality is enforced using back translation, which produces the values G_AB(G_BA(B)) and G_BA(G_AB(A)). Back-translation values allow the definition of highly related mapping functions: during training, each mapping function uses both real data and data generated by the opposite mapping function, in order to strengthen the relationship between them. The model is trained using a generative adversarial approach on a bilingual parallel dataset (OPUS), which contains a large amount of aligned data for several language pairs. The effectiveness of the CycleNLPGAN approach is demonstrated by addressing both sentence alignment and neural machine translation tasks, obtaining results that are competitive against state-of-the-art models.
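
The cycle-consistency term described above can be summarised with a short sketch. The snippet below is purely illustrative and is not taken from the thesis: the linear layers standing in for G_AB and G_BA and the weighting factor lambda_cyc are assumptions, and the full model would use sequence-to-sequence generators plus adversarial and translation losses. Only the shape of the loss, an L1 distance between A and G_BA(G_AB(A)) and between B and G_AB(G_BA(B)), reflects the abstract.

import torch
import torch.nn.functional as F

dim = 16
# Hypothetical stand-ins for the generators: in the thesis these are full
# seq2seq translation models, not linear maps over fixed-size embeddings.
G_AB = torch.nn.Linear(dim, dim)   # maps language-A embeddings towards language B
G_BA = torch.nn.Linear(dim, dim)   # maps language-B embeddings towards language A

def cycle_consistency_loss(a_emb, b_emb, lambda_cyc=10.0):
    # A -> B -> A and B -> A -> B reconstructions (back translation in embedding space).
    a_rec = G_BA(G_AB(a_emb))      # G_BA(G_AB(A)) should recover A
    b_rec = G_AB(G_BA(b_emb))      # G_AB(G_BA(B)) should recover B
    return lambda_cyc * (F.l1_loss(a_rec, a_emb) + F.l1_loss(b_rec, b_emb))

# Toy batch of "sentence embeddings" for the two languages.
a_batch = torch.randn(4, dim)
b_batch = torch.randn(4, dim)
loss = cycle_consistency_loss(a_batch, b_batch)
loss.backward()                    # gradients flow into both mapping functions

Because both mapping functions appear in each reconstruction path, optimising this single term already couples them during training, which is the mechanism the abstract refers to when it says each mapping function learns from data produced by the other.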

Supervisors: Luca Cagliero, Moreno La Quatra
Academic year: 2020/21
Publication type: Electronic
Number of pages: 104
Subjects:
Degree programme: Master's degree programme in Ingegneria Informatica (Computer Engineering)
Degree class: New regulations > Master's degree > LM-32 - INGEGNERIA INFORMATICA (Computer Engineering)
Partner companies: NOT SPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/19176