Elia Fontana
Fine-tuning Deep Language Models for Zero-Shot Text Classification.
Supervisors: Paolo Garza, Lorenzo Bongiovanni. Politecnico di Torino, Master's degree in Computer Engineering (Ingegneria Informatica), 2023
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
High-quality Zero-Shot Text Classification is one of the holy grails of NLP because it avoids the difficult, time-consuming and expensive process of collecting and labelling data for supervised training. Deep language models have shown remarkable capabilities in various natural language processing tasks, but their effectiveness in Zero-Shot Text Classification remains an area of exploration. Large language models (LLMs) such as GPT-4 and LaMDA have shown striking generalization capabilities, but they are not open-source and are in any case intractable with ordinary computing resources. The aim of this thesis is to analyze the task in greater depth in the context of tractable, open-source language models. In particular, we focus on MPNet, a language model pre-trained on extensive general corpora and specialized for the task of Semantic Text Similarity (STS). We explore the advantages of using a supervised contrastive learning objective during the fine-tuning phase to address the challenge of Zero-Shot Text Classification. The main focus of this work is on enhancing the model's Zero-Shot capability by generating a better-suited vector representation for short texts, such as noun phrases, that are used as labels. Given a document, such as a scientific paper or journal article, consisting of a title and a description, noun phrases are extracted from the title. The framework aims to generate embeddings for these short-text keywords so that they lie as close as possible, in the semantic vector space, to the embedding of the associated long-text description. Furthermore, the alignment and distribution uniformity of the generated vectors are analyzed to gain a deeper understanding of the semantic vector space produced by MPNet during fine-tuning.
By shedding light on these aspects, this thesis contributes to a deeper understanding of Zero-Shot Text Classification and presents novel insights that may pave the way for enhancing the performance and capabilities of deep language models on this task. |
---|---|
Supervisors: | Paolo Garza, Lorenzo Bongiovanni |
Academic year: | 2023/24 |
Publication type: | Electronic |
Number of Pages: | 61 |
Degree programme: | Master's degree in Computer Engineering (Ingegneria Informatica) |
Degree class: | New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING |
Collaborating companies: | FONDAZIONE LINKS |
URI: | http://webthesis.biblio.polito.it/id/eprint/29355 |
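The ideas named in the abstract (classifying a document by its nearest label embedding, a contrastive objective that pulls noun-phrase embeddings towards their descriptions, and alignment and uniformity of the embedding space) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the function names are hypothetical, the toy 2-D vectors stand in for MPNet embeddings, the loss is a generic InfoNCE-style formulation with in-batch negatives, and alignment and uniformity follow one commonly used definition (mean squared distance between positive pairs, and the log of the mean Gaussian potential over all pairs). The exact objective and metrics used in the thesis may differ.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def zero_shot_classify(doc_emb, label_embs):
    """Pick the label whose embedding is most similar to the document embedding."""
    return max(label_embs, key=lambda name: cosine(doc_emb, label_embs[name]))


def contrastive_loss(phrase_embs, desc_embs, temperature=0.05):
    """InfoNCE-style loss (assumed form): phrase i should match description i,
    with the other descriptions in the batch acting as negatives."""
    total = 0.0
    for i, phrase in enumerate(phrase_embs):
        logits = [cosine(phrase, d) / temperature for d in desc_embs]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum - logits[i]
    return total / len(phrase_embs)


def alignment(pairs):
    """Mean squared distance between normalized positive pairs (lower is closer)."""
    return sum(sq_dist(normalize(a), normalize(b)) for a, b in pairs) / len(pairs)


def uniformity(embs, t=2.0):
    """Log of the mean Gaussian potential over distinct pairs (lower is more spread out)."""
    unit = [normalize(e) for e in embs]
    vals = [
        math.exp(-t * sq_dist(unit[i], unit[j]))
        for i in range(len(unit))
        for j in range(i + 1, len(unit))
    ]
    return math.log(sum(vals) / len(vals))


# Toy example: hypothetical 2-D "embeddings" for two labels and one document.
labels = {"sports": [1.0, 0.0], "politics": [0.0, 1.0]}
document = [0.9, 0.2]
print(zero_shot_classify(document, labels))  # sports
```

In a real setting, the vectors would come from the fine-tuned encoder rather than being written by hand, and the loss would be computed on tensors so that gradients can flow back into the model; the pure-Python version only makes the arithmetic explicit.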