Giuseppe Gallipoli
Text Style Transfer: a Cycle-Consistent Adversarial Approach.
Supervisors: Luca Cagliero, Moreno La Quatra. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2022
Abstract: |
Following recent advances in Natural Language Processing, a growing number of tasks are gaining importance and new downstream applications are being designed. In content moderation, for example, it would be desirable to have a system capable of rewriting comments that use offensive language into non-offensive versions, while intelligent writing assistants aim to help users improve the quality of their writing. Both use cases are subtasks of Text Style Transfer (TST), whose objective is to convert an input sentence carrying a certain source style (e.g., offensive) into its counterpart in the desired target attribute (e.g., non-offensive). The main challenge is content preservation: the input text must be rewritten in the target style without modifying its original meaning. The goal of this master's thesis is to design a style transfer model based on the well-known CycleGAN architecture which, although first introduced in Computer Vision, can be successfully adapted to process textual data. By leveraging the adversarial training framework, the network learns to convert the style of source sentences by generating sequences that resemble real data. Moreover, an additional cycle-consistency objective creates a cyclical relationship between the two attribute transformation directions, which constrains the generation process and makes it possible to preserve the content of documents. This is particularly useful when parallel data is not available, since it provides an alternative form of supervision to guide the training of the model. Although CycleGAN was originally devised for a fully unsupervised setting, when parallel pairs are available the architecture can be extended with a supplementary supervised objective that encourages the network to produce sentences matching the target sequences.
Unlike previous works, which typically train the two style transfer models in isolation, the proposed method jointly learns both mapping functions in an end-to-end fashion. Furthermore, it uses state-of-the-art Transformer-based models to implement the underlying components, improving on existing solutions based on recurrent networks. The developed approach has been applied to two common TST subtasks: sentiment transfer, whose purpose is to change the opinion expressed by a sentence from negative to positive and vice versa, and formality transfer, which requires transforming an informal text into its corresponding formal version and vice versa. The performance of the implemented method has been evaluated on three benchmark datasets and compared with several alternative solutions: in most cases, the technique introduced in this thesis outperforms previous works and improves on state-of-the-art results, confirming the effectiveness of the proposed approach. |
---|---|
Supervisors: | Luca Cagliero, Moreno La Quatra |
Academic year: | 2022/23 |
Publication type: | Electronic |
Number of Pages: | 168 |
Additional Information: | Thesis under embargo. Full text not available |
Subjects: | |
Degree programme: | Master's degree programme in Data Science And Engineering |
Degree class: | New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING |
Collaborating companies: | UNSPECIFIED |
URI: | http://webthesis.biblio.polito.it/id/eprint/24474 |
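The abstract describes a generator objective that combines adversarial terms for both transfer directions, a cycle-consistency term, and an optional supervised term when parallel pairs exist. Since the full text is not available in this record, the following is only a minimal sketch of how such an objective is commonly assembled, not the thesis's actual formulation. The function names, the use of token-level negative log-likelihood as the reconstruction measure, and the weights `lambda_cyc` and `lambda_sup` are all assumptions made for illustration.

```python
import math


def reconstruction_nll(token_probs, original_ids):
    """Mean negative log-likelihood of the original tokens under the
    per-position distributions produced after a round trip A -> B -> A.
    (Assumed reconstruction measure; the thesis may use a different one.)"""
    if len(token_probs) != len(original_ids):
        raise ValueError("one distribution per original token is required")
    nll = [-math.log(dist[tok]) for dist, tok in zip(token_probs, original_ids)]
    return sum(nll) / len(nll)


def generator_loss(adv_ab, adv_ba, cyc_aba, cyc_bab, sup=None,
                   lambda_cyc=10.0, lambda_sup=1.0):
    """Joint loss for both generators: one adversarial term per direction,
    weighted cycle-consistency terms, and an optional supervised term that
    is only added when parallel pairs are available.
    The default weights are hypothetical placeholders."""
    total = adv_ab + adv_ba + lambda_cyc * (cyc_aba + cyc_bab)
    if sup is not None:
        total += lambda_sup * sup
    return total


# Toy round trip over a 3-token vocabulary: the reconstruction assigns
# high probability to the original tokens [0, 2], so the cycle term is small.
probs = [[0.9, 0.05, 0.05], [0.1, 0.1, 0.8]]
cycle = reconstruction_nll(probs, [0, 2])
print(generator_loss(0.7, 0.6, cycle, cycle))           # unsupervised setting
print(generator_loss(0.7, 0.6, cycle, cycle, sup=0.4))  # with parallel pairs
```

Summing the terms for both directions in a single scalar reflects the joint, end-to-end training of the two mapping functions mentioned in the abstract; in a real system each term would be a differentiable tensor produced by the Transformer-based generators and discriminators rather than a plain float.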