Andrea Costamagna
Equilibrium Propagation for Recurrent Neural Networks based on Resistive Switching Devices: from circuit implementation to supervised machine learning.
Supervisors: Fernando Corinto, Carlo Ricciardi. Politecnico di Torino, Master's degree programme in Nanotechnologies For Icts (Nanotecnologie Per Le Ict), 2021
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
The human brain is a biological machine capable of performing real-time computing tasks with an extremely low power budget. For this reason, brain-inspired computing aims to mimic how the brain functions in order to design new paradigms of computation. This requires working concurrently at the hardware and software levels to optimize their interplay. On the software side, a well-established neuronal model is the Additive Model, which treats the brain as a non-linear dynamical system evolving under the influence of external stimuli. This model can be classified as a continuous-time Recurrent Neural Network (RNN). Recently, Bengio and Scellier designed a learning algorithm named Equilibrium Propagation (EP) that can efficiently train this model. On the hardware side, the ability of ReRAM devices to store multiple conductance states in a compact component has been explored as a way to reproduce in hardware the synapses, i.e. the biological units at the basis of learning. However, these devices present many challenges in terms of retention and linearity. Additionally, RNNs can be mapped onto purely analogue circuits that, if implemented, would dramatically reduce their computing time. This work explores the interplay of software and hardware toward the definition and simulation of a ReRAM-based network for the analogue implementation of the Additive Model. In the first part, the versatility of Equilibrium Propagation is tested on two machine learning tasks: reconstruction of corrupted images and classification of images. The reconstruction task is used to verify that EP can optimize a wide range of network structures. Additionally, this work proposes a modified version of the EP algorithm, together with a preliminary result of its usage. For classification, the EP algorithm is applied to increasingly complex tasks. 
Finally, a feature engineering approach inspired by the learning mechanisms of the Additive Model is discussed. This approach considerably reduces the number of synaptic connections needed while maintaining satisfactory performance. In all the problems considered, the non-linearity is introduced using one of two functions: the hard sigmoid or the hard hyperbolic tangent. Both prove effective on all the investigated tasks while being compactly implementable in hardware. The second part of this work develops ngSpice models to simulate the analogue circuit associated with the RNNs previously discussed in software. The starting point is a behavioral definition of the primitives to be connected in the network. Then, the description level is progressively lowered toward a full hardware description. In its final form, the proposed neuron is made of two operational amplifiers, three resistors, one capacitor and two Schottky diodes. This solution is particularly promising because the neuron is designed to limit the retention problem and the non-linearity of ReRAMs. In the last part of this work, one additional non-ideality of ReRAMs is considered: cycle-to-cycle variability. This variability prevents the network from reaching the performance obtained in digital simulations and forces the use of quantized conductance states. Extending EP to quantized networks requires some precautions, which are discussed using the classification task. |
---|---|
Supervisors: | Fernando Corinto, Carlo Ricciardi |
Academic year: | 2020/21 |
Publication type: | Electronic |
Number of pages: | 148 |
Subjects: | |
Degree programme: | Master's degree programme in Nanotechnologies For Icts (Nanotecnologie Per Le Ict) |
Degree class: | New organization > Master's degree > LM-29 - Electronic Engineering |
Collaborating companies: | Politecnico di Torino |
URI: | http://webthesis.biblio.polito.it/id/eprint/19696 |
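
The abstract refers to the Additive Model and to the hard sigmoid and hard hyperbolic tangent non-linearities. As a rough illustration only, the sketch below relaxes a tiny network to equilibrium with explicit Euler steps. The normalized form ds_i/dt = -s_i + sum_j W[i][j]*rho(s_j) + b_i, the clipping ranges [0, 1] and [-1, 1], and all names (`hard_sigmoid`, `hard_tanh`, `additive_model_step`, `relax`) are assumptions made for this example and are not taken from the thesis, which may use a different parameterization.

```python
from typing import Callable, List


def hard_sigmoid(x: float) -> float:
    """Assumed form: clip x to the interval [0, 1]."""
    return min(1.0, max(0.0, x))


def hard_tanh(x: float) -> float:
    """Assumed form: clip x to the interval [-1, 1]."""
    return min(1.0, max(-1.0, x))


def additive_model_step(
    s: List[float],
    W: List[List[float]],
    b: List[float],
    rho: Callable[[float], float],
    dt: float,
) -> List[float]:
    """One explicit-Euler step of ds_i/dt = -s_i + sum_j W[i][j]*rho(s_j) + b_i."""
    n = len(s)
    return [
        s[i] + dt * (-s[i] + sum(W[i][j] * rho(s[j]) for j in range(n)) + b[i])
        for i in range(n)
    ]


def relax(
    s: List[float],
    W: List[List[float]],
    b: List[float],
    rho: Callable[[float], float],
    dt: float = 0.1,
    steps: int = 500,
) -> List[float]:
    """Iterate the dynamics for a fixed number of steps (hypothetical stopping rule)."""
    for _ in range(steps):
        s = additive_model_step(s, W, b, rho, dt)
    return s


# Two symmetrically coupled neurons with a constant bias (illustrative values).
W = [[0.0, 0.5], [0.5, 0.0]]
b = [0.2, 0.2]
equilibrium = relax([0.0, 0.0], W, b, hard_sigmoid)
```

For these illustrative values both states stay in the linear region of the hard sigmoid, so the fixed point satisfies s = 0.5*s + 0.2 and the relaxation settles near 0.4 for each neuron. Equilibrium Propagation builds on equilibria of this kind, but the learning rule itself is not shown here.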