
Gabriele Tomatis
Can you hear what I’ve learned? Explaining audio transformer-based models through embedding sonification.
Supervisors: Eliana Pastor, Alkis Koudounas. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2025
PDF (Tesi_di_laurea)
- Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: | Since their introduction, transformer models have shown strong performance in the analysis of structured data such as images, time series, and audio. Their ability to solve a wide range of tasks quickly made them the state of the art in many domains. How they reason, however, remains an open problem, because they translate the data into an embedding representation that only they can interpret. Despite this, only a few works try to address the problem using methods from the field of Explainable AI. The aim of this discipline is to make AI models interpretable so that they can be trusted and relied upon; this is impossible if we do not understand how the models reason. To address these issues, we take advantage of Descript Audio VAE, a model specifically trained to compress an audio waveform into a latent space representation and to reconstruct the waveform from it. In particular, we apply a gating layer between the embedding space of the model that we want to interpret and the latent space of Descript Audio VAE. This mapping converts the embeddings of the audio transformer under analysis into latents of the second model, which can decode this otherwise opaque representation into sound: something that we can interpret. |
---|---|
Supervisors: | Eliana Pastor, Alkis Koudounas |
Academic year: | 2024/25 |
Publication type: | Electronic |
Number of pages: | 73 |
Subjects: | |
Degree programme: | Master's degree programme in Computer Engineering (Ingegneria Informatica) |
Degree class: | New degree system > Master's degree > LM-32 - COMPUTER ENGINEERING |
Collaborating companies: | Politecnico di Torino |
URI: | http://webthesis.biblio.polito.it/id/eprint/35389 |
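
The mapping described in the abstract, from transformer embeddings to Descript Audio VAE latents, can be pictured with a minimal sketch. This record does not specify the form of the gating layer, so the sigmoid-gated linear projection below, the class name `GatedProjection`, the helper functions, and the toy dimensions are all assumptions made for illustration. The Descript Audio VAE decoder itself is not called here.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic function, used here as the gate activation (an assumption)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, x):
    """Affine map: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class GatedProjection:
    """Hypothetical gated map from an embedding (d_in) to a latent (d_out).

    output = sigmoid(W_gate x + b_gate) * (W_value x + b_value)
    """

    def __init__(self, d_in: int, d_out: int, seed: int = 0):
        rng = random.Random(seed)

        def init_matrix():
            return [[rng.uniform(-0.1, 0.1) for _ in range(d_in)]
                    for _ in range(d_out)]

        self.w_value = init_matrix()
        self.w_gate = init_matrix()
        self.b_value = [0.0] * d_out
        self.b_gate = [0.0] * d_out

    def __call__(self, embedding):
        value = linear(self.w_value, self.b_value, embedding)
        gate = [sigmoid(g)
                for g in linear(self.w_gate, self.b_gate, embedding)]
        # The gate, in (0, 1), scales each candidate latent dimension.
        return [g * v for g, v in zip(gate, value)]


# Toy sizes chosen for illustration only; real embedding and latent
# dimensions depend on the models involved.
layer = GatedProjection(d_in=8, d_out=4)
data_rng = random.Random(1)
frames = [[data_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(3)]

# One latent vector per embedding frame; in the full pipeline these would
# be handed to the audio decoder to produce a waveform.
latents = [layer(frame) for frame in frames]
```

In this sketch the weights are random and untrained. In practice such a layer would have to be trained so that the decoded audio reflects what the embeddings encode.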