
Modeling and classifying textual data through Transformer-based architecture: a comparative approach in Natural Language Processing

Valentina Margiotta


Supervisors: Tania Cerquitelli, Alessio Bosca, Gianpiero Sportelli. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2021

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Download (8MB)
Abstract:

Among deep learning models applied to Natural Language Processing (NLP), the Transformer architecture has attracted great interest thanks to its attention mechanism, which lets a model focus on specific parts of the input by weighing the relationships between words, independently of where they appear in a sentence. Many models based on this architecture have been developed by the research community. In my Master's Thesis I examined the following language models:

• Bidirectional Encoder Representations from Transformers (BERT)
• Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA)
• Generative Pre-trained Transformer 2 (GPT-2)

These models have been applied to the NLP text classification task on different datasets, both in English and in Italian, in order to evaluate and compare the performance of models pre-trained on an English corpus against that of a multilingual model. Specifically, two types of classification have been considered in this Master's Thesis: multi-class and multi-label. They differ in that, in a multi-label classification problem, each sentence in the training set can be assigned to multiple categories instead of a single label. In this context, the above models have been trained and fine-tuned with the aim of assessing their reliability and accuracy in predicting the correct labels associated with the input text. The experiments showed that all the models were able to predict the label associated with a textual input with high accuracy, and that the text classification task could be carried out in a short time thanks to the Transformer architecture: these models are already intensively pre-trained on a large corpus of data and can therefore be fine-tuned inexpensively on numerous downstream NLP tasks.
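As an illustration only (not code from the thesis), the following minimal PyTorch sketch shows the scaled dot-product attention at the core of the Transformer: every position attends to every other position, so the relationship between two words is modeled regardless of their distance in the sentence. The tensor shapes are illustrative assumptions.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_k). Attention weights depend only on the
    # content of the words, not on how far apart they are in the sentence.
    scores = q @ k.transpose(-2, -1) / math.sqrt(k.size(-1))
    weights = torch.softmax(scores, dim=-1)  # each row: how much one word attends to every other
    return weights @ v

# Toy usage: one "sentence" of 5 tokens embedded in 8 dimensions (self-attention).
x = torch.randn(1, 5, 8)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # torch.Size([1, 5, 8])
```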

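Similarly, a minimal sketch (assuming the Hugging Face `transformers` library) of how the multi-class and multi-label setups described in the abstract differ in practice: the former picks a single label via an argmax over the logits, while the latter thresholds independent sigmoids so a sentence can receive several labels. The `bert-base-uncased` checkpoint and the number of labels are illustrative assumptions, not values from the thesis.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Multi-class: exactly one label per sentence (softmax + cross-entropy head).
multiclass_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=4
)

# Multi-label: each sentence may carry several labels; `problem_type`
# switches the loss to independent sigmoids + binary cross-entropy.
multilabel_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=4,
    problem_type="multi_label_classification",
)

inputs = tokenizer("The attention mechanism relates distant words.",
                   return_tensors="pt")

with torch.no_grad():
    predicted_class = multiclass_model(**inputs).logits.argmax(dim=-1)   # one label
    predicted_labels = torch.sigmoid(
        multilabel_model(**inputs).logits
    ) > 0.5                                                              # possibly several labels
```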
Supervisors: Tania Cerquitelli, Alessio Bosca, Gianpiero Sportelli
Academic year: 2021/22
Publication type: Electronic
Number of pages: 84
Subjects:
Degree programme: Master's degree programme in Ingegneria Informatica (Computer Engineering)
Degree class: New regulations > Master's degree > LM-32 - INGEGNERIA INFORMATICA
Collaborating companies: Celi srl
URI: http://webthesis.biblio.polito.it/id/eprint/20574