Deep Recommender Models Data Flow Optimization for AI Accelerators
Giuseppe Ruggeri
Deep Recommender Models Data Flow Optimization for AI Accelerators.
Supervisor: Daniele Jahier Pagliari. Politecnico di Torino, Master of Science program in Computer Engineering, 2023
Licence: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Deep Learning-based Recommender Models (DLRMs) have become indispensable tools for businesses to provide effective personalized recommendations to end users. As a result, the workload generated by these models is substantial, representing, for instance, more than 79% of the AI workload in Meta’s data centers. Optimizing such models is therefore crucial and can lead to large energy savings, as well as increased throughput and better real-time responsiveness. State-of-the-art DLRMs suffer from significant performance limitations due to embedding layers, which project sparse categorical features onto dense, continuous embedding vectors. In particular, the bottleneck is the large number of random memory accesses needed to retrieve many small embedding vectors from look-up tables stored in off-chip memory.
To mitigate this issue, some existing approaches exploit the large bandwidth offered by High Bandwidth Memory (HBM), while others propose to build clusters of heterogeneous nodes exploiting the advantages introduced by each platform
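
The embedding look-up that the abstract identifies as the bottleneck can be illustrated with a minimal sketch. It is not taken from the thesis: the table size, the vector dimension, the sample IDs, the function names `make_table` and `embedding_bag`, and the choice of sum pooling are all assumptions made for illustration only.

```python
import random


def make_table(num_rows, dim, seed=0):
    """Build a hypothetical embedding table: num_rows vectors of length dim."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_rows)]


def embedding_bag(table, indices):
    """Gather the rows named by `indices` and sum-pool them into one dense vector.

    Sum pooling is an assumption here; other reductions are possible.
    """
    dim = len(table[0])
    pooled = [0.0] * dim
    for idx in indices:
        # Each idx selects an arbitrary row, so consecutive reads land at
        # unrelated addresses: this is the random-access pattern at issue.
        row = table[idx]
        for j in range(dim):
            pooled[j] += row[j]
    return pooled


# Hypothetical sizes; real look-up tables are far larger and sit in off-chip memory.
table = make_table(num_rows=1000, dim=4)
sparse_feature = [3, 917, 42]  # hypothetical categorical IDs for one sample
dense = embedding_bag(table, sparse_feature)
print(dense)
```

Each sample touches only a few short rows scattered across a large table, so the cost is dominated by memory access rather than arithmetic, which is the behaviour the abstract describes.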
