Michele Panariello
Low-complexity neural networks for robust acoustic scene classification in wearable audio devices.
Supervisor: Antonio Servetti. Politecnico di Torino, Master of Science program in Data Science and Engineering, 2022
PDF (Tesi_di_laurea) - Thesis
Licence: Creative Commons Attribution Non-commercial No Derivatives.
Abstract
This work concerns the design of a machine learning pipeline that performs acoustic scene classification (ASC) on a pair of headphones by means of a convolutional neural network (CNN). ASC is the task of recognizing an environment (e.g. bus, park, office) from the sounds it produces (e.g. engine noise, birds chirping, typing sounds). In our setting, the goal is to make the headphones context-aware in order to enhance the user experience. We capture audio from the headphones' microphone and run the CNN on the headphones' own hardware to perform classification in real time. A challenging aspect of the task is the lack of recordings made with the headphones' microphone, which forces us to resort to external data sources. This can be problematic: training on audio acquired with a microphone different from the one used in the final device may cause a data distribution shift and degrade classification performance (a phenomenon known as "device mismatch").
Moreover, because of the embedded environment, only a low-complexity CNN can be used, which may limit modeling accuracy.
