Data Driven: AI Voice Cloning
Alessandro Emmanuel Pecora
Supervisors: Luca Cagliero, Moreno La Quatra, Lorenzo Vaiani. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2023
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
As humans, we transmit a significant amount of information through speech. Evolution has produced an entire organ dedicated to modulating audio signals for communication, and speech is the most commonly used communication channel among humans. In the field of speech processing, several transformations can be used to extract useful information from speech data. Applications range from clinical settings, such as detecting Parkinson's disease from voice samples, to the media industry, where speech processing methods can be used to develop software for automatic dubbing in multiple languages. This thesis focuses on two specific tasks within the field of speech processing: speaker recognition (SR) and text-to-speech synthesis (TTS).
Speaker recognition involves determining an individual's identity from their voice, while text-to-speech synthesis entails generating natural-sounding human speech waveforms from input text.
