Simone Sasso
Automated creation of Podcasts empowered by Text-To-Speech.
Supervisors: Antonio Vetro', Giovanni Garifo. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2022
PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
The goal of Text-to-Speech (TTS) is to synthesize human-like speech from text. Over the last decade, the field has improved dramatically thanks to advances in deep learning. TTS models based on neural networks can now produce speech that is almost indistinguishable from a human voice. Consequently, the technology has become increasingly popular and has greatly improved the way people interact with machines. Despite this progress, neural TTS is far from a solved problem and still has several critical issues. Both training and inference require heavy computational resources, and models tend to make mistakes on corner cases or on text from a domain different from that of the training set.
This thesis examines the development of a pipeline that generates podcasts by using a Text-to-Speech model to read news articles.
