Fine-tuning Deep Language Models for Zero-Shot Text Classification

Elia Fontana

Supervisors: Paolo Garza, Lorenzo Bongiovanni. Politecnico di Torino, Master's degree in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica), 2023

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract

High-quality Zero-Shot Text Classification is one of the holy grails of NLP, as it avoids the difficult, time-consuming and expensive process of collecting and labelling data for supervised training. Deep language models have shown remarkable capabilities in various natural language processing tasks, but their effectiveness in Zero-Shot Text Classification remains an open area of exploration. Large language models (LLMs) such as GPT-4 and LaMDA have shown striking generalization capabilities, but they are not open-source and are, in any case, intractable with ordinary computing resources. The aim of this thesis is to analyze this task in greater depth in the context of tractable, open-source language models.

In particular, we focus on MPNet, a language model pre-trained on extensive general corpora and specialized for the task of Semantic Text Similarity (STS)