Design and analysis of VLSI architectures for Transformers

Davide Dura

Design and analysis of VLSI architectures for Transformers.

Rel. Maurizio Martina, Guido Masera, Alberto Marchisio. Politecnico di Torino, Corso di laurea magistrale in Ingegneria Elettronica (Electronic Engineering), 2022

Preview

PDF (Tesi_di_laurea) - Tesi
Licenza: Creative Commons Attribution Non-commercial No Derivatives.
Download (1MB) | Preview

Abstract

Neural networks have been a big innovation field recently, with more and more applications addressing Machine Learning algorithms. A big part of these is made of Natural Language Processing (NLP) algorithms, which handle words, sentences and group of sentences. Machine translation, text generation, sentiment analysis and question and answering are just some examples of the NLP tasks. In this scope, the model that has gained more popularity is clearly the Transformer, with its great adaptability to different objectives. This network architecture is based on the attention mechanism and it has exceeded the performances of previously-used recurrent and convolutional neural networks. There are already several different models based on the Transformer: its encoder-decoder nature gives a lot of room to explore by changing the values of the parameters or the layer configuration.

BERT (Bidirectional Encoder Representations from Transformers) and Universal Transformer network are two particular models derived from the Transformer

Tipo di pubblicazione

Elettronica

URI

https://webthesis.biblio.polito.it/id/eprint/25517

Modifica (riservato agli operatori)