Improving the Continual Learning Model VAG

Giuseppe Gabriele

Supervisors: Stefano Di Carlo, Alessandro Savino, Alessio Carpegna. Politecnico di Torino, Master's degree course in Computer Engineering (Ingegneria Informatica), 2024

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

In recent years, there has been significant progress in developing machines that can learn the way humans do. One field that pursues this goal is Continual Learning. An essential characteristic of human-like learning is the capacity to recall past events and acquire new knowledge without forgetting what was previously learned. This aspect is crucial because one of the most significant challenges in continual learning is Catastrophic Forgetting: learning new tasks can cause a model to lose what it learned on earlier ones. Various solutions exist for multi-task environments, including Vocabulary-Aware Label Generation (VAG). This thesis improves the VAG model by using a multi-label approach, with more instances for each dataset, and by combining different techniques for avoiding Catastrophic Forgetting, thereby increasing accuracy. The initial attempts to improve VAG modified the class labels in the datasets used (BANKING77 and CLINC150, both public and also used by the original VAG model), but this did not yield positive results. Better results were eventually achieved by adding supporting labels to each sentence rather than modifying the existing ones. Each input sentence thus has more than one label, which improved accuracy compared to the original VAG. A newer sentence transformer is then used to increase accuracy further. Finally, a combination of VAG and Elastic Weight Consolidation (EWC) is demonstrated, yielding the best accuracy presented in this project.
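
For readers unfamiliar with EWC, the sketch below illustrates the penalty term as it is commonly formulated: a quadratic cost, weighted by an estimate of the diagonal Fisher information, on moving each parameter away from its value after the previous task. It is not taken from the thesis. The function names, the flat-list parameter representation, and the `lam` weighting factor are assumptions made for illustration only.

```python
def ewc_penalty(params, old_params, fisher, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2.

    params     -- current parameter values (hypothetical flat list)
    old_params -- parameter values saved after the previous task
    fisher     -- per-parameter importance (diagonal Fisher estimate)
    lam        -- assumed hyperparameter trading off old vs. new tasks
    """
    if not (len(params) == len(old_params) == len(fisher)):
        raise ValueError("all inputs must have the same length")
    return 0.5 * lam * sum(
        f * (p - q) ** 2 for p, q, f in zip(params, old_params, fisher)
    )


def total_loss(new_task_loss, params, old_params, fisher, lam):
    """Loss on the new task plus the EWC penalty protecting the old task."""
    return new_task_loss + ewc_penalty(params, old_params, fisher, lam)
```

Parameters with a large Fisher value are expensive to change, so training on a new task tends to leave them close to their earlier values, while parameters with a small Fisher value remain free to adapt.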

Supervisors: Stefano Di Carlo, Alessandro Savino, Alessio Carpegna
Academic year: 2023/24
Publication type: Electronic
Number of pages: 60
Subjects:
Degree course: Master's degree course in Computer Engineering (Ingegneria Informatica)
Degree class: New system > Master's degree > LM-32 - COMPUTER ENGINEERING
Joint supervision institution: UNIVERSITY OF ILLINOIS AT CHICAGO (UNITED STATES OF AMERICA)
Collaborating companies: NOT SPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/31716