Saeedeh Javadi
Knowledge Editing in Large Language Models.
Supervisor: Paolo Garza. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2024
PDF (Tesi_di_laurea) - Thesis. License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract:
The use of large language models (LLMs) as dynamic repositories of knowledge is becoming increasingly prevalent. However, these models face significant challenges in managing outdated, erroneous, or privacy-sensitive information. The capacity to edit knowledge within these models quickly, without recourse to costly retraining, has emerged as a pivotal area of investigation. Existing knowledge-editing techniques are, on the whole, effective; however, they frequently lack robustness, particularly when applied across multiple languages. This thesis explores multilingual knowledge editing with multilingual models such as Llama-2, with a particular focus on enhancing the models' ability to update their knowledge efficiently and accurately in a multilingual context. Our approach builds on MEMIT (Mass-Editing Memory in Transformers), which enables the simultaneous editing of thousands of memories within transformer-based LLMs and thereby provides a scalable solution for correcting outdated or erroneous information. To further enhance this process, we integrate MEMIT with in-context learning (ICL), a technique that enables models to generalise knowledge from a few examples provided at inference time. The objective is to combine these two methods in order to achieve precise and extensive knowledge updates across languages, thereby addressing one of the key challenges in multilingual LLMs. The thesis also incorporates prompt engineering to improve the accuracy of the model's behaviour after knowledge edits. By carefully designing prompts, we guide the model's responses so that updated information is both accurate and contextually appropriate for the target language. This mitigates issues such as over-editing, where unintended changes affect related knowledge, and instability, where the model struggles to retain the edited information across tasks and languages. Finally, by examining the transformer architecture in detail, we study how knowledge flows through the model's layers during the editing process. Empirical evaluation on multilingual models such as Llama-2 shows that our combined approach significantly improves performance on multilingual knowledge-editing tasks. We evaluate the models on several languages, demonstrating enhanced accuracy and consistency in their ability to update and retain information.
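To make the combination of an edited fact with in-context learning more concrete, the following is a minimal sketch of how such an ICL-style prompt could be assembled for a multilingual model such as Llama-2. It is illustrative only and not taken from the thesis: the helper name `build_icl_prompt`, the example facts, demonstrations, and language templates are all hypothetical assumptions.

```python
# Illustrative sketch only: wraps an edited fact into a few-shot (ICL) prompt
# for a multilingual model. All names and facts here are hypothetical examples.

EDIT = {  # the new fact the model should use after the edit
    "subject": "Lionel Messi",
    "relation": "plays for",
    "new_object": "Inter Miami",
}

# One demonstration per language, showing the desired answer format.
DEMONSTRATIONS = {
    "en": "Q: Which team does Cristiano Ronaldo play for? A: Al Nassr.",
    "it": "D: In quale squadra gioca Cristiano Ronaldo? R: Al Nassr.",
}

# Question templates in the target languages.
QUESTION_TEMPLATES = {
    "en": "Q: Which team does {subject} play for? A:",
    "it": "D: In quale squadra gioca {subject}? R:",
}


def build_icl_prompt(edit: dict, lang: str) -> str:
    """Compose a prompt that states the edited fact, shows one demonstration,
    and then asks the question in the target language."""
    new_fact = f"{edit['subject']} {edit['relation']} {edit['new_object']}."
    return "\n".join([
        f"New fact: {new_fact}",           # the edit the model must respect
        DEMONSTRATIONS[lang],               # ICL demonstration of the format
        QUESTION_TEMPLATES[lang].format(subject=edit["subject"]),
    ])


if __name__ == "__main__":
    for lang in ("en", "it"):
        print(f"--- {lang} ---")
        print(build_icl_prompt(EDIT, lang))
        print()
```

The prompt produced this way would then be passed to the edited model, so that the in-context demonstration and the updated fact jointly steer the answer in the target language.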
| Field | Value |
|---|---|
| Supervisor | Paolo Garza |
| Academic year | 2024/25 |
| Publication type | Electronic |
| Number of pages | 79 |
| Subjects | |
| Degree programme | Master's degree programme in Data Science and Engineering |
| Degree class | New system > Master's degree > LM-32 - Computer Engineering |
| Collaborating companies | University College London |
| URI | http://webthesis.biblio.polito.it/id/eprint/33208 |