Text generation through a GPT based GAN model.

Gianluca La Malfa

Supervisors: Luca Cagliero, Moreno La Quatra. Politecnico di Torino, Master's degree course in Data Science And Engineering, 2023

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Lack of data is one of the main problems in machine learning. Most algorithms, whatever their purpose, need a large amount of data to be trained on. The problem is even more acute in deep learning, because neural networks usually require more data than classical machine learning algorithms. Closely related is the problem of class imbalance, which can arise in many situations, above all in classification tasks. With class imbalance, machine learning models typically over-predict the larger class because of its higher prior probability. As a result, instances of the smaller class are misclassified more often than those of the larger class. Although several approaches already address the problem for tabular data, it remains difficult to solve for natural language. Generating completely new sentences, and more specifically re-balancing the data distribution with respect to the minority class, is an important challenge. Generative models are one way of performing text generation. In particular, the introduction of Generative Adversarial Networks (GANs) opened a new way of approaching the generation problem. While GANs already obtain impressive results in computer vision, the same does not hold for natural language processing. Although several models reach significant results, there is still room for improvement before GANs are competitive in this field. The goal of this study is to understand whether the GAN architecture can reach performance comparable to GPT, which represents the state of the art in text generation. To this end, the financial domain is the object of study, since words that carry one meaning in general language can mean something completely different in finance.
The aim is to show the robustness and versatility of the proposed models. The work is divided into three phases: first, collecting and building the datasets; second, implementing a traditional GAN and comparing it with the state-of-the-art approach on a classification task built on top of the dataset re-balanced through text generation; third, building a modified conditional GAN that uses GPT as its generator and comparing it with the state-of-the-art approach in the same setting. All results are presented at the end of the work, highlighting the main differences in performance, robustness and completeness.

Supervisors: Luca Cagliero, Moreno La Quatra
Academic year: 2022/23
Publication type: Electronic
Number of pages: 94
Subjects:
Degree course: Master's degree course in Data Science And Engineering
Degree class: New system > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies: Blue Reply Srl
URI: http://webthesis.biblio.polito.it/id/eprint/26719