Differentiable Working Memory

Younes Bouhadjar

Differentiable Working Memory.

Supervisor: Candido Pirri. Politecnico di Torino, Master's degree programme in Nanotechnologies for ICTs (Nanotecnologie per le ICT), 2018

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


Artificial Intelligence has recently shown great success at language translation, computer vision, and many other sensory perception tasks. However, it still requires further improvement to address problems that involve higher-order cognitive behaviors, such as reasoning. The human brain relies on multiple memory systems for intelligent behavior. Working memory is an essential component of higher-order cognitive tasks ranging from language, planning, and reasoning to decision making. In this thesis, I introduce a new model called Differentiable Working Memory (DWM), which emulates human working memory. Because it shows the same functional characteristics as working memory, the model robustly learns psychology-inspired tasks and converges faster than comparable state-of-the-art models. Moreover, the DWM model successfully generalizes to sequences two orders of magnitude longer than those used in training. Our in-depth analysis shows that the behavior of DWM is interpretable and that it learns fine control over memory, allowing it to retain, ignore, or forget information based on its relevance. To facilitate running the DWM model on the different working memory (WM) tasks, and running other models on the same set of tasks to establish baselines, we designed a framework called MI-Prometheus, which standardizes the interface connecting the components needed in a machine learning system: problems, model architectures, and training/testing configurations. This work is a step towards building Artificial General Intelligence models, where different kinds of memory and reasoning centers are needed. We believe that the DWM model will be a critical part of such a cognitive architecture.

Supervisors: Candido Pirri
Academic year: 2018/19
Publication type: Electronic
Number of Pages: 50
Degree programme: Master's degree programme in Nanotechnologies for ICTs (Nanotecnologie per le ICT)
Degree class: New organization > Master science > LM-29 - ELECTRONIC ENGINEERING
Collaborating companies: IBM
URI: http://webthesis.biblio.polito.it/id/eprint/18725