Data Driven: AI Voice Cloning

Alessandro Emmanuel Pecora

Supervisors: Luca Cagliero, Moreno La Quatra, Lorenzo Vaiani. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2023

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract

As humans, we transmit a significant amount of information through speech. Evolution has developed an entire organ with the function of modulating audio signals for communication purposes, and speech is the most commonly used communication channel among humans. In the field of speech processing, several transformations can be used to extract useful information from speech data. Applications range from clinical settings, such as detecting Parkinson's disease from voice samples, to the media industry, where software for automatic dubbing in multiple languages can be developed using speech processing methods. This thesis focuses on two specific tasks within the field of speech processing: speaker recognition (SR) and text-to-speech synthesis (TTS).

Speaker recognition involves determining an individual's identity from their voice, while text-to-speech synthesis entails generating natural-sounding human speech waveforms from input text.

Supervisors

Luca Cagliero, Moreno La Quatra, Lorenzo Vaiani

Academic Year

2022/23

Publication type

Electronic

Number of pages

Degree programme

Master's degree programme in Data Science And Engineering

Degree class

New degree system > Master's degree > LM-32 - COMPUTER ENGINEERING

URI

https://webthesis.biblio.polito.it/id/eprint/27738
