Valentina Margiotta
Modeling and classifying textual data through Transformer-based architecture: a comparative approach in Natural Language Processing.
Supervisors: Tania Cerquitelli, Alessio Bosca, Gianpiero Sportelli. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2021
Full text: PDF (Tesi_di_laurea), 8 MB. Licence: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
In the landscape of deep learning models applied to Natural Language Processing (NLP), the Transformer architecture has attracted great interest thanks to its attention mechanism, which lets models focus on specific parts of the input by modeling the relationships between words regardless of their position in a sentence. Many models based on this architecture have been developed by the research community. In my Master's thesis I examined the following language models:

• Bidirectional Encoder Representations from Transformers (BERT)
• Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA)
• Generative Pre-trained Transformer 2 (GPT-2)

These models were applied to the NLP text classification task on several datasets, in both English and Italian, in order to evaluate and compare the performance of models pre-trained on an English corpus against that of a multilingual model.
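As a minimal sketch of the kind of experiment the abstract describes (the checkpoint name, label count, and example texts are assumptions for illustration, not taken from the thesis), a pre-trained Transformer such as BERT can be applied to text classification via the Hugging Face `transformers` library; swapping `bert-base-cased` for `bert-base-multilingual-cased` mirrors the English-versus-multilingual comparison:

```python
# Sketch: a pre-trained Transformer with a classification head, using the
# Hugging Face `transformers` library. Checkpoint name, label count, and
# texts are illustrative assumptions, not the thesis code.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# "bert-base-cased" targets English only; "bert-base-multilingual-cased"
# also covers Italian, enabling the comparison described above.
MODEL_NAME = "bert-base-multilingual-cased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=4,  # assumed number of target classes
)

texts = ["Questo film è fantastico.", "This movie is terrible."]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**batch).logits   # shape: (batch_size, num_labels)
predicted_class = logits.argmax(dim=-1)  # one class index per input text
```

The classification head is randomly initialized here; fine-tuning would then train it (and optionally the encoder) with a standard cross-entropy loss on labelled examples.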
Specifically, two types of classification are considered in this thesis: multi-class and multi-label.
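In practice the two settings differ mainly in the output head and loss: multi-class assumes exactly one correct label per example (softmax plus cross-entropy), while multi-label lets each label be decided independently (per-label sigmoid plus binary cross-entropy). A minimal sketch with made-up logit values:

```python
# Sketch of the two classification heads compared in the thesis.
# Logits are dummy values; in practice they come from the Transformer.
import torch

logits = torch.tensor([[2.0, -1.0, 0.5]])  # one example, three labels

# Multi-class: exactly one label per example -> softmax + argmax.
multi_class_pred = logits.softmax(dim=-1).argmax(dim=-1)  # tensor([0])

# Multi-label: each label decided independently -> sigmoid + threshold.
multi_label_pred = (logits.sigmoid() > 0.5).int()  # tensor([[1, 0, 1]])
```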
Relatori (Supervisors): Tania Cerquitelli, Alessio Bosca, Gianpiero Sportelli
Anno Accademico (Academic year):
Tipo di pubblicazione (Publication type): Tesi (Thesis)
Numero di pagine (Number of pages):
Corso di laurea (Degree programme): Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering)
Classe di laurea (Degree class):
Aziende collaboratrici (Collaborating companies):
URI:
