Francesco Falconieri
Open-Source Design of a Bitline-Computing-SRAM for Neural Networks Acceleration at the Edge.
Supervisor: Mario Roberto Casu. Politecnico di Torino, Master's degree programme in Electronic Engineering (Ingegneria Elettronica), 2022
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
The vast majority of state-of-the-art computing platforms are based on the von Neumann architecture, in which data lives in two physically separate places: the main computing unit (i.e., the CPU) and the data storage unit (i.e., the memory). This approach has allowed continuous performance improvements over the last decades, but it began to hit a dead end as data-centric applications became pervasive, such as those in artificial intelligence, cryptography and signal processing, because of the well-known communication cost between the two units (the von Neumann bottleneck). This problem raised the question of whether the approach could still be pursued, and the concept of In-Memory Computing (IMC) began to attract interest. The IMC paradigm aims to relocate computation from the CPU to the memory units (e.g., SRAMs or caches) for data-centric applications, in order to reduce the number of memory accesses. IMC solutions based on CMOS technology fall into two main categories: analog and digital implementations. Analog IMC implementations allow massive parallelism but yield approximate results, because their analog nature makes them subject to process variations and mismatches. This limits them to error-resilient applications such as neural networks. Digital IMC implementations, unlike their analog counterparts, yield exact results, which extends their use to a larger number of fields, for instance cryptography, at the expense of reduced parallelism. Several SRAM or cache memory architectures have been proposed which embed logic gates in the memory cells, implement sequential peripheral circuitry for complex computations, and exploit bitline computation in a wired-OR fashion. This approach is especially well suited to Binary Neural Networks (BNNs), which require a single bit for both activations and weights and thus reduce their product to a simple XNOR.
Interest in deploying neural network models on edge devices keeps growing, so the popularity of the IMC concept is expected to increase, but the difficulty of designing IMC devices might hinder its practical adoption. Following the OpenROAD philosophy of "democratizing hardware design", an open-source flow is used to design a Bitline-Computing-SRAM based on a customized version of the OpenRAM memory compiler, as a step toward making the adoption of IMC accelerators viable. Toward this goal, this thesis revisits and extends the concept of pulsed-wordline bitline computing in order to solve the read-stability problem caused by simultaneous wordline activation in a 6T bitcell array. A computing sense-amplifier array is introduced which suits the acceleration of BNNs and AdderNets, and a new circuit topology for a single-ended sense amplifier is proposed to enhance the system's performance while limiting the area overhead and providing a variation-tolerant design through pulse-width-resilient behaviour. The work targets the open-source PDK for SkyWater's SKY130 (180 nm-130 nm) process, which represents a sweet spot for commercial IoT applications. |
---|---|
Supervisors: | Mario Roberto Casu |
Academic year: | 2021/22 |
Publication type: | Electronic |
Number of pages: | 158 |
Subjects: | |
Degree programme: | Master's degree programme in Electronic Engineering (Ingegneria Elettronica) |
Degree class: | New regulations > Master's degree > LM-29 - ELECTRONIC ENGINEERING |
Collaborating companies: | NOT SPECIFIED |
URI: | http://webthesis.biblio.polito.it/id/eprint/22833 |
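The abstract states that a BNN product reduces to a simple XNOR and that bitline computation can supply the logic. The following Python sketch illustrates that arithmetic only; it is not taken from the thesis. It assumes the common encoding bit 1 = +1 and bit 0 = -1, and it assumes an idealized bitline model from the general IMC literature in which activating two cells on the same column yields AND on the bitline and NOR on the complementary bitline. The function names (`bitline_and_nor`, `xnor_bit`, `xnor_popcount_dot`, `reference_dot`) are hypothetical.

```python
def bitline_and_nor(cell_a, cell_b):
    """Idealized model (assumption): two cells share one bitline pair.

    BL stays high only if both cells store 1 (AND);
    BLB stays high only if both cells store 0 (NOR).
    """
    bl = cell_a & cell_b
    blb = (1 - cell_a) & (1 - cell_b)
    return bl, blb


def xnor_bit(a, w):
    """XNOR recovered from the two bitline results: XNOR = AND or NOR."""
    bl, blb = bitline_and_nor(a, w)
    return bl | blb


def xnor_popcount_dot(a_bits, w_bits):
    """Dot product of two {-1, +1} vectors stored as bits (1 -> +1, 0 -> -1)."""
    if len(a_bits) != len(w_bits):
        raise ValueError("activation and weight vectors must have equal length")
    n = len(a_bits)
    # Each matching bit pair contributes +1, each mismatch contributes -1.
    matches = sum(xnor_bit(a, w) for a, w in zip(a_bits, w_bits))
    return 2 * matches - n


def reference_dot(a_bits, w_bits):
    """Same dot product computed with explicit signed multiplications."""
    signed = lambda b: 1 if b else -1
    return sum(signed(a) * signed(w) for a, w in zip(a_bits, w_bits))


if __name__ == "__main__":
    activations = [1, 0, 1, 1]
    weights = [1, 1, 0, 1]
    print(xnor_popcount_dot(activations, weights))
    print(reference_dot(activations, weights))
```

The two functions agree because each signed product is +1 exactly when the two bits match, so the sum equals the number of matches minus the number of mismatches, i.e. `2 * matches - n`. A real array would also have to handle sensing, timing and read stability, which this sketch ignores.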