Enrico Capuano
Enhancing Embedding Models through Specialized Finetuning in the banking sector.
Supervisors: Daniele Apiletti, Claudia Berloco. Politecnico di Torino, Master's degree course in Data Science And Engineering, 2024
Abstract
The rapid advancement of Generative AI and Natural Language Processing has led to the widespread adoption of embedding models in various applications, including question-answering systems. These systems represent words and sentences as embeddings in order to retrieve relevant information. However, open-source, pre-trained, multipurpose embedding models may fail to capture domain-specific nuances, such as those of the banking sector. This study investigates the benefits of fine-tuning pre-trained embedding models on a dedicated dataset to improve their performance in specific contexts. In detail, we build a dataset for the banking sector from proprietary documents. We split the dataset into train and test sets.
The first is used to test different pre-trained open-source multipurpose embedding models and the second to get a fine-tuning
