Paolo Volpe
Efficient deploy of Streamwise‐StyleMelGAN on edge CPUs.
Supervisor: Enrico Magli. Politecnico di Torino, Master's degree course in Communications And Computer Networks Engineering (Ingegneria Telematica E Delle Comunicazioni), 2022
Abstract: | Deep learning has opened new opportunities in speech processing and speech coding. Neural vocoders outperform conventional approaches in the perceptual quality of reconstructed speech. However, deep-learning-based approaches still suffer from high computational complexity, which makes deployment on edge devices challenging. In this thesis we present the analysis and optimization of Streamwise-StyleMelGAN (SSMGAN), a neural vocoder able to synthesize high-quality wideband speech at 1.6 kbps. First, we present the ML compiler Apache TVM and assess the performance of the standard (non-optimized) SSMGAN on ARM CPUs, showing that the baseline model is significantly slower than real-time on edge devices. Then, we discuss quantization techniques, which significantly reduce the memory footprint of the model. Finally, we introduce depthwise-separable convolutions and other low-rank approximations, which drastically reduce the complexity of the baseline model. These techniques significantly speed up inference with marginal effects on the final speech quality, making the model faster than real-time on a single core of an ARM Cortex-A57. |
---|---|
Supervisors: | Enrico Magli |
Academic year: | 2022/23 |
Publication type: | Electronic |
Number of pages: | 49 |
Additional information: | Confidential thesis. Full text not available |
Subjects: | |
Degree course: | Master's degree course in Communications And Computer Networks Engineering (Ingegneria Telematica E Delle Comunicazioni) |
Degree class: | New system > Master's degree > LM-27 - TELECOMMUNICATIONS ENGINEERING |
Joint supervision institution: | INSTITUT EURECOM (FRANCE) |
Collaborating companies: | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e. V. |
URI: | http://webthesis.biblio.polito.it/id/eprint/24498 |