HERMIONE: a Co-design Methodology for Layer-wise Approximation of Neural Networks on RISC-V

Flavia Guella. HERMIONE: a Co-design Methodology for Layer-wise Approximation of Neural Networks on RISC-V. Supervisors: Maurizio Martina, Guido Masera, Emanuele Valpreda, Michele Caon. Politecnico di Torino, Master's degree in Electronic Engineering (Corso di laurea magistrale in Ingegneria Elettronica), 2023

Abstract:

Convolutional Neural Networks (CNNs) are now ubiquitous thanks to their remarkable performance across a wide spectrum of tasks, from computer vision to speech recognition. However, high task accuracy generally comes at the cost of increased power consumption, which conflicts with the growing trend of moving computation towards the edge. In this context, PULP, an open-source microcontroller featuring an 8-core cluster based on the RISC-V Instruction Set Architecture (ISA), is chosen as the target platform of this work. It offers a good trade-off between power optimization and reconfigurability; the latter is required by the heterogeneity of applications and is not available in custom accelerators. HERMIONE (Highly Efficient Reconfigurable Multiplier for Inference of apprOximate neural Networks at the Edge) further improves energy efficiency by exploiting the inherent error resilience of CNNs. It provides novel support for layer-wise approximate computing and mixed-precision quantization. A signed integer multiplier, with run-time configurable approximation levels and run-time selection of the operand precision, is designed and added to the RISC-V architecture, together with a Control and Status Register (CSR) that manages the multiplier configuration. The approximation level for each convolutional layer of a trained and quantized network is selected by a multi-objective genetic algorithm, NSGA-II, which searches for Pareto-optimal solutions in terms of dynamic power and task accuracy. Results are collected on a custom CNN for the MNIST dataset; inference is performed at the maximum precision allowed by the multiplier, and only the approximation level is tuned. Compared with a baseline exact multiplier of the same input bit-width, optimized by Synopsys DesignWare, up to 48% of power can be saved with no accuracy degradation, and up to 54% with an accuracy loss below 5%. These results are promising and demonstrate the effectiveness of approximation as a low-power technique in the CNN domain.
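
The layer-wise selection described in the abstract can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the thesis's implementation: the approximation scheme (zeroing low-order operand bits), the level set, the per-layer multiply counts, the sensitivity weights, and the cost proxy are all hypothetical, and an exhaustive enumeration stands in for NSGA-II, which would replace the loop in a realistic search space. The sketch only shows what "Pareto-optimal in terms of power and accuracy" means for per-layer approximation levels.

```python
from itertools import product

# All constants below are hypothetical, chosen only for illustration.
LEVELS = (0, 1, 2)              # approximation levels, 0 = exact
MACS = (100, 400, 50)           # multiplications per layer
SENSITIVITY = (1.0, 0.5, 2.0)   # per-layer weight of multiplier error
OPERAND_BITS = 8                # assumed operand width

def approx_mul(a, b, level):
    """Toy approximate signed multiply: zero the `level` low bits of each operand."""
    mask = ~((1 << level) - 1)
    return (a & mask) * (b & mask)

def mean_abs_error(level, lo=-8, hi=8):
    """Mean absolute error of approx_mul over a small operand range."""
    pairs = [(a, b) for a in range(lo, hi) for b in range(lo, hi)]
    return sum(abs(a * b - approx_mul(a, b, level)) for a, b in pairs) / len(pairs)

ERR = {level: mean_abs_error(level) for level in LEVELS}

def score(config):
    """Return (cost proxy, error proxy) for one level per layer; both are minimized."""
    cost = sum(m * (OPERAND_BITS - l) for m, l in zip(MACS, config))
    error = sum(s * ERR[l] for s, l in zip(SENSITIVITY, config))
    return cost, error

def dominates(p, q):
    """True if p is no worse than q in every objective and better in at least one."""
    return all(x <= y for x, y in zip(p, q)) and any(x < y for x, y in zip(p, q))

def pareto_front(candidates):
    """Keep the configurations that no other configuration dominates."""
    return {c: s for c, s in candidates.items()
            if not any(dominates(t, s) for t in candidates.values())}

# Exhaustive search over all per-layer assignments (stand-in for NSGA-II).
front = pareto_front({c: score(c) for c in product(LEVELS, repeat=len(MACS))})

for config, (cost, error) in sorted(front.items(), key=lambda kv: kv[1]):
    print(config, cost, round(error, 3))
```

In this toy model the all-exact configuration and the most aggressive configuration both lie on the front, as the two extremes of the trade-off; the configurations in between are the candidates a designer would choose from given an accuracy budget.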

Supervisors: Maurizio Martina, Guido Masera, Emanuele Valpreda, Michele Caon
Academic year: 2022/23
Publication type: Electronic
Number of pages: 139
Additional information: Thesis under embargo. Full text not available
Subjects:
Degree programme: Master's degree in Electronic Engineering (Corso di laurea magistrale in Ingegneria Elettronica)
Degree class: New system > Master's degree > LM-29 - Electronic Engineering
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/26864