Simone Clemente
Cognitive Aware Incremental Knowledge Update of Large Language Models.
Supervisor: Marco Mellia. Politecnico di Torino, Master's degree program in Data Science And Engineering, 2024
PDF (Tesi_di_laurea)
- Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
Despite their remarkable capabilities, large language models (LLMs) struggle to update their knowledge incrementally without catastrophic forgetting or indiscriminate learning. In contrast, humans effortlessly integrate new information, detect conflicts with existing beliefs, and selectively update their knowledge. This work introduces a novel paradigm inspired by the human brain: Cognitive Aware Incremental Knowledge Update. We implement and evaluate two key components within existing LLM architectures: (1) Inner State Awareness, which allows LLMs to classify new information as novel, familiar, or conflicting; and (2) targeted updates through Differentiated Plasticity, which distinguishes between neurons containing previous knowledge (busy) and rarely used neurons (free). Through a series of controlled experiments, we demonstrate the potential benefits of this approach, including improved preservation of prior knowledge during updates, more effective handling of conflicting information, and an enhanced ability to target specific knowledge for updates. While challenges remain, particularly in scaling to full-size LLMs and real-world scenarios, our work provides a promising direction for developing more flexible and adaptable language models. In this study, we present a detailed overview of the proposed method, supported by a comprehensive review of the existing literature, a complete description of the experiments, and an in-depth analysis of our findings. |
---|---|
Supervisors: | Marco Mellia |
Academic year: | 2024/25 |
Publication type: | Electronic |
Number of Pages: | 79 |
Subjects: | |
Degree program: | Master's degree program in Data Science And Engineering |
Degree class: | New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING |
Joint supervision institution: | KUNGLIGA TEKNISKA HOGSKOLAN (ROYAL INSTITUTE OF TECHNOLOGY) - EECS (SWEDEN) |
Collaborating companies: | Huawei Technologies France S.A.S.U |
URI: | http://webthesis.biblio.polito.it/id/eprint/33206 |