Emanuel Cascione
Performance of Deep Neural Networks for Sound Event Localization and Detection in Varying Noise Conditions.
Supervisors: Luciano Lavagno, Mihai Teodor Lazarescu. Politecnico di Torino, Master's degree programme in Mechatronic Engineering (Ingegneria Meccatronica), 2023
Abstract
This thesis investigates sound event localization and detection (SELD), an emerging research topic in audio signal processing and machine learning. SELD aims to identify and localize multiple overlapping sounds in an acoustic environment, with applications in audio surveillance, robotic auditory systems, and human-machine interaction. Its main challenges are reverberation, noise, and source variability. The goal of the thesis is to compare machine learning algorithms and architectures for SELD, focusing on deep neural networks (DNNs). The research questions are: How do different types of DNNs perform on SELD? What are the strengths and weaknesses of each type? How do DNNs perform under different noise conditions? The methodology consists of training and testing the DNN models on the same dataset, which contains recordings from first-order ambisonics microphones.
Each audio track contains at most two overlapping active sound sources at a time.
