Efficient deploy of Streamwise‐StyleMelGAN on edge CPUs

Paolo Volpe

Efficient deploy of Streamwise‐StyleMelGAN on edge CPUs.

Supervisor: Enrico Magli. Politecnico di Torino, Master's degree programme in Communications And Computer Networks Engineering (Ingegneria Telematica E Delle Comunicazioni), 2022

Abstract:

Deep learning has opened new opportunities in speech processing and speech coding. Neural vocoders outperform conventional approaches in the perceptual quality of reconstructed speech. However, deep-learning approaches still suffer from high computational complexity, which makes deployment on edge devices challenging. In this thesis we present the analysis and optimization of Streamwise-StyleMelGAN (SSMGAN), a neural vocoder able to synthesize high-quality wideband speech at 1.6 kbps. First, we present the ML compiler Apache TVM and assess the performance of the standard (non-optimized) SSMGAN on ARM CPUs, showing that the baseline model is significantly slower than real-time on edge devices. We then discuss quantization techniques, which significantly reduce the memory footprint of the model. Finally, we introduce depthwise-separable convolutions and other low-rank approximations, which drastically reduce the complexity of the baseline model. These techniques significantly speed up inference with marginal effects on the final speech quality, making the model faster than real-time on a single core of an ARM Cortex-A57.
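As an illustration of why depthwise-separable convolutions reduce complexity, the following minimal sketch compares the weight count of a standard 1-D convolution with that of its depthwise-separable counterpart. The layer sizes (64 input channels, 64 output channels, kernel size 9) are assumptions chosen for the example and are not taken from the thesis or from SSMGAN; biases are ignored.

```python
def conv1d_params(c_in, c_out, kernel_size):
    """Number of weights in a standard 1-D convolution (bias ignored)."""
    return c_in * c_out * kernel_size


def depthwise_separable_params(c_in, c_out, kernel_size):
    """Number of weights in a depthwise-separable 1-D convolution (bias ignored).

    The layer is split into a depthwise convolution (one filter of length
    kernel_size per input channel) followed by a pointwise convolution
    (kernel size 1) that mixes the channels.
    """
    depthwise = c_in * kernel_size
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 64, 9

standard = conv1d_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
reduction = standard / separable

print(f"standard conv weights:  {standard}")
print(f"separable conv weights: {separable}")
print(f"reduction factor:       {reduction:.2f}x")
```

For a fixed output length, the number of multiply-accumulate operations per output sample scales with the same weight counts, so under these assumed sizes the compute cost falls by roughly the same factor as the parameter count.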

Supervisor: Enrico Magli
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 49
Additional Information: Thesis under embargo. Full text not available
Subjects:
Degree programme: Master's degree programme in Communications And Computer Networks Engineering (Ingegneria Telematica E Delle Comunicazioni)
Degree class: New organization > Master science > LM-27 - TELECOMMUNICATIONS ENGINEERING
Joint supervision institution: INSTITUT EURECOM (FRANCE)
Collaborating companies: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e. V.
URI: http://webthesis.biblio.polito.it/id/eprint/24498