Javier Jesus Poveda Rodrigo
Inference optimization of Large Language Models on RISC-V HPC platforms.
Advisors: Daniele Jahier Pagliari, Mohamed Amine Hamdi, Alessio Burrello. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2024
PDF (Tesi_di_laurea), 9MB – Thesis. License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Over the past decade, there have been significant improvements in Artificial Intelligence (AI), particularly in the area of natural language processing (NLP), thanks to the emergence of Transformers and, more generally, of Large Language Models (LLMs). These models have enabled numerous deep-learning applications such as translation, text generation, image generation, and many others. However, transformer-based models present new challenges because of their computationally intensive attention mechanisms and extremely high memory footprint. Even though these workloads are typically offloaded to GPUs, some applications and use cases require the CPU as the workhorse because of its lower cost and greater flexibility.
For instance, while training is too burdensome for CPU environments, CPUs are well suited to single-example or even batched inference.
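To make the cost argument concrete, the sketch below is a minimal NumPy implementation of single-head scaled dot-product attention (an illustration of the general mechanism, not code from the thesis). The (seq_len × seq_len) score matrix is what makes attention quadratic in sequence length, in both compute and memory.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Minimal single-head attention (illustrative sketch only).

    q, k, v: (seq_len, d) arrays. The intermediate score matrix has
    shape (seq_len, seq_len), so cost grows quadratically with the
    sequence length.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                 # (seq_len, seq_len)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                            # (seq_len, d)

rng = np.random.default_rng(0)
seq_len, d = 128, 64
q = rng.standard_normal((seq_len, d))
out = scaled_dot_product_attention(q, q, q)
```

Doubling `seq_len` quadruples the size of the score matrix, which is why long-context inference stresses memory bandwidth on CPUs in particular.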