Paolo Volpe
Efficient deploy of Streamwise‐StyleMelGAN on edge CPUs.
Supervisor: Enrico Magli. Politecnico di Torino, Master's degree programme in Communications And Computer Networks Engineering (Ingegneria Telematica E Delle Comunicazioni), 2022
Abstract
Deep learning has opened new opportunities in speech processing and speech coding. Neural vocoders outperform conventional approaches in terms of perceptual quality of reconstructed speech. However, approaches based on deep learning still suffer from complexity issues, which make the deployment on edge devices challenging. In this thesis we present the analysis and the optimizations performed on Streamwise-StyleMelGAN (SSMGAN), a neural vocoder able to synthesize high-quality wideband speech at 1.6 kbps. First, we present the ML compiler Apache TVM and assess the performance of standard (non-optimized) SSMGAN on ARM CPUs, showing how the baseline model is significantly slower than real-time on edge devices.
Then we discuss quantization techniques, which can significantly reduce the memory footprint of the model.
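To give a rough idea of why quantization shrinks the memory footprint, the following minimal sketch applies symmetric per-tensor 8-bit quantization to a list of random 32-bit float weights. It is only an illustration under stated assumptions: the abstract does not say which quantization scheme the thesis uses, and the function names `quantize_int8` and `dequantize`, the random weights, and the tensor size are all hypothetical. It uses only the Python standard library, not Apache TVM.

```python
from array import array
import random


def quantize_int8(weights):
    """Symmetric per-tensor quantization of float weights to int8 (illustrative)."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale for the whole tensor; the largest magnitude maps to +/-127.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to approximate 32-bit float values."""
    return array("f", (v * scale for v in q))


# Hypothetical weights standing in for one layer of a model.
random.seed(0)
weights = array("f", (random.gauss(0.0, 1.0) for _ in range(4096)))

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

float_bytes = len(weights) * weights.itemsize  # 4 bytes per weight
int8_bytes = len(q) * q.itemsize               # 1 byte per weight
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"float32: {float_bytes} B, int8: {int8_bytes} B")
print(f"largest rounding error: {max_error:.5f} (scale = {scale:.5f})")
```

Each weight goes from four bytes to one, at the cost of a rounding error of at most half the scale per weight. Whether that error is acceptable for speech quality is a separate question that this sketch does not address.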
