
Deep neural networks for language recognition

Axel Baron

Deep neural networks for language recognition.

Supervisor: Sandro Cumani. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2023

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


This work belongs to the field of Language Recognition, whose goal is to determine the language of a given speech sample. It focuses mainly on acoustic feature extraction and on deep neural network technologies for solving language recognition problems. In particular, the log-mel extraction technique and the ECAPA-TDNN model are the main technologies that drew my attention and motivated this work. The document begins by introducing the concepts and history of Language Recognition. I then explain what acoustic features are, what purpose they serve, and the specific case of log-mel features. Next comes an overview of neural networks, which progresses to the more complex case of deep neural networks and then to the ECAPA-TDNN model. Finally, I present my experimental setup, the decisions I made in approaching the subject, and an analysis of my results.

Supervisors: Sandro Cumani
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 49
Degree programme: Master's degree programme in Ingegneria Informatica (Computer Engineering)
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/26761