Emma Anna Safia Boulharts
Pruning ALBERT transformer for Analog-AI.
Advisor: Carlo Ricciardi. Politecnico di Torino, Corso di laurea magistrale in Nanotechnologies For Icts (Nanotecnologie Per Le Ict), 2023
             
PDF (Tesi_di_laurea) - Tesi. License: Creative Commons Attribution Non-commercial No Derivatives. (8 MB)
Abstract: Analog In-Memory Computing reduces latency and energy consumption in Deep Neural Network inference and training. The Analog-AI group developed a chip, ARES, that computes the Multiply-Accumulate (MAC) operation using Phase Change Memory devices. To demonstrate the chip's performance, the ALBERT model, a more compact version of the widely known BERT transformer, is currently under experimental study. This report provides an in-depth analysis of the contributions to the MAC, revealing that some activation/weight pairs carry larger importance, while others can be safely pruned with very limited impact on accuracy. A new row-wise pruning strategy, followed by fine-tuning, is proposed, which reduces model size at equivalent accuracy. The algorithm is then applied to the GLUE benchmark using the ALBERT architecture, demonstrating simulated software-equivalent performance even under substantial weight pruning, potentially enabling several improvements: fewer required hardware tiles, better power efficiency, and simpler on-chip model deployment.
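The thesis's actual importance metric (the measured contribution of activation/weight pairs to the analog MAC) and its fine-tuning step are not reproduced on this page. As a minimal sketch of the general row-wise pruning idea only, the snippet below ranks the rows of a linear layer by plain L2 weight norms and keeps the top fraction; the function name `prune_rows`, the `keep_ratio` parameter, and the L2 scoring are all illustrative assumptions, not the thesis's method:

```python
import torch
import torch.nn as nn

def prune_rows(linear: nn.Linear, keep_ratio: float = 0.5) -> nn.Linear:
    """Return a smaller Linear layer keeping only the highest-scoring rows.

    Row importance is approximated here by the L2 norm of each weight row;
    the thesis instead ranks rows by their contribution to the analog MAC,
    which this sketch does not reproduce.
    """
    scores = linear.weight.norm(p=2, dim=1)            # one score per output row
    n_keep = max(1, int(keep_ratio * linear.out_features))
    keep = torch.topk(scores, n_keep).indices.sort().values  # keep original row order

    pruned = nn.Linear(linear.in_features, n_keep, bias=linear.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(linear.weight[keep])       # copy surviving rows
        if linear.bias is not None:
            pruned.bias.copy_(linear.bias[keep])
    return pruned

# Toy usage: halve the rows of a BERT-sized feed-forward projection.
layer = nn.Linear(768, 3072)
smaller = prune_rows(layer, keep_ratio=0.5)
print(smaller.weight.shape)  # torch.Size([1536, 768])
```

Note that physically removing rows shrinks the layer's output dimension, so downstream layers must be resized to match; zeroing the pruned rows in place instead keeps all shapes fixed. Either way, a fine-tuning pass after pruning, as the abstract describes, is what recovers software-equivalent accuracy.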
    
|---|---|
| Advisor: | Carlo Ricciardi |
| Academic year: | 2023/24 |
| Publication type: | Electronic |
| Number of pages: | 55 |
| Subjects: | |
| Degree programme: | Corso di laurea magistrale in Nanotechnologies For Icts (Nanotecnologie Per Le Ict) |
| Degree class: | New regulations > Master's degree > LM-29 - Electronic Engineering |
| Partner companies: | Not specified |
| URI: | http://webthesis.biblio.polito.it/id/eprint/28592 |

