Politecnico di Torino

Application of Transformers to edge-computing in ultra-low power devices

Francesco Bianco Morghet

Application of Transformers to edge-computing in ultra-low power devices.

Supervisors: Daniele Jahier Pagliari, Alessio Burrello. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2021

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


Low-power edge devices and IoT sensors are employed in many different tasks that benefit from machine learning techniques. However, the high resource requirements of Deep Learning models, in terms of computing power, memory footprint and energy consumption, make their deployment at the edge very challenging. In particular, an emerging class of deep learning models, the Transformers, has obtained state-of-the-art results in fields such as natural language processing (NLP) and computer vision (CV). On the other hand, typical Transformer models contain millions or billions of parameters and perform billions of operations, which makes them unsuitable for execution on edge devices. The effectiveness of smaller-scale Transformers, instead, is largely unstudied. This thesis focuses on applying Transformers to hand movement classification based on surface electromyographic (sEMG) signals, a latency-sensitive application that cannot rely on cloud inference and must therefore be executed on low-power edge devices. It is shown that Transformers can attain nearly the same accuracy as previous state-of-the-art architectures, with a complexity reduction of up to 5x in terms of memory footprint and number of multiply-accumulate operations. In particular, the number of parameters of the proposed models is in the range of 100k to 500k, which is several orders of magnitude lower than most state-of-the-art Transformers for other applications. The thesis also explores another common practice in the Transformer literature, the use of pre-training, showing that fine-tuning a pre-trained model can improve accuracy even in the highly compressed network architectures presented in this work. Accuracy improvements of 1-3% are observed on average. As a last step towards optimizing a Transformer for edge deployment, a Neural Architecture Search (NAS) is applied to substitute some of the self-attention layers in the network with simpler convolutions, without impairing the final accuracy.
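
To give a rough sense of the parameter budgets mentioned above, the following sketch counts the weights of a standard Transformer encoder stack. The layer counts and dimensions are illustrative assumptions and are not taken from the thesis; the function names are hypothetical, and input embeddings and the classification head are left out.

```python
def encoder_layer_params(d_model: int, d_ff: int) -> int:
    """Parameter count of one standard Transformer encoder layer."""
    # Self-attention: query, key, value and output projections,
    # each a d_model x d_model matrix plus a bias vector.
    attention = 4 * (d_model * d_model + d_model)
    # Position-wise feed-forward network: two linear layers with biases.
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two layer normalizations, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    return attention + feed_forward + layer_norms


def encoder_params(num_layers: int, d_model: int, d_ff: int) -> int:
    """Parameter count of a stack of identical encoder layers."""
    return num_layers * encoder_layer_params(d_model, d_ff)


# Hypothetical tiny configuration (assumed, not the thesis's architecture).
small = encoder_params(num_layers=4, d_model=64, d_ff=256)

# Dimensions of a widely used base-size NLP encoder (12 layers, 768 wide).
large = encoder_params(num_layers=12, d_model=768, d_ff=3072)

print(f"small encoder: {small:,} parameters")
print(f"large encoder: {large:,} parameters")
print(f"ratio: {large / small:.0f}x")
```

Under these assumptions the tiny stack has about 200k parameters, inside the 100k to 500k range quoted in the abstract, while the base-size stack has about 85 million. Because the count grows quadratically with the model width, shrinking the width is the main lever for fitting such a model on a memory-constrained device.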

Supervisors: Daniele Jahier Pagliari, Alessio Burrello
Academic year: 2021/22
Publication type: Electronic
Number of Pages: 83
Degree programme: Master's degree programme in Data Science and Engineering
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/21217