
Recording of ecological audiovisual scenes for the Audio Space Lab of the Politecnico di Torino

Ignazio Ligani

Recording of ecological audiovisual scenes for the Audio Space Lab of the Politecnico di Torino.

Supervisors: Arianna Astolfi, Marco Carlo Masoero. Politecnico di Torino, Master's degree programme in Cinema and Media Engineering (Ingegneria del Cinema e dei Mezzi di Comunicazione), 2023

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Nowadays, a large portion of the elderly population suffers from hearing loss, which negatively affects quality of life. Hearing-impaired older adults are therefore often fitted with hearing aids (HAs). However, many complain of insufficient hearing support, especially in complex listening environments. One cause lies in the standard audiometric tests used to assess HA performance, which do not account for the actual conditions in which listening takes place in real life, such as the complexity of the acoustical scenarios and the influence of other senses, notably vision. Thus, to perform ecological listening tests, i.e., tests involving everyday-life conditions, Virtual Reality (VR) systems have recently been used to run listening tests inside virtual sound environments that represent the most frequently attended environments (FAEs), with multiple varying noise sources (e.g., cafes, stations, supermarkets). To support the development of ecological tests, several databases have been published, comprising collections of audiovisual simulations or spatial audio recordings of real-life acoustical environments. However, none of them provides in-field recordings of both the visual and the auditory background, which are necessary for a full representation of real-world conditions. Hence, the goal of this thesis was to acquire spatial audiovisual recordings that achieve a high degree of visual and auditory realism and that allow listeners to be properly tested with different noise sources coming from specific surrounding directions. The created scenes are meant to be played inside the Audio Space Lab room of the Politecnico di Torino, which hosts a 360° audiovisual reproduction system composed of a spherical cup of 16 loudspeakers (LSs), able to reproduce audio tracks up to third-order ambisonics (3OA) encoding, synced with the Oculus Quest 2 VR headset, which is used to stream the visual content.
The main criteria found in the literature that define environmental complexity were considered in selecting the FAE scenes to be recorded; that is, noise type, and the position and distance of target and noise sources. Then, to acquire spatial audio recordings that can be played on any reproduction system, the 3OA encoding was chosen, which decomposes the sampled sound field into 16 spherical harmonics, a generic representation that can be easily mapped onto any LS array. After identifying all the possible environments, two of them were selected to start acquiring the audiovisual scenarios: a conference room and a classroom. To allow the auralization of speech audiometry tests inside these environments, combined with different kinds of noise and different directions, multiple Room Impulse Responses (RIRs) were recorded with the Zylia ZM-1 19-capsule microphone, one for each configuration of listener-to-target and listener-to-noise positions. Similarly, multiple 6K stereoscopic 360° videos sampling the visual scenes were taken using the Insta360 Pro camera. Post-production procedures for both audio and video tracks were then implemented to erase unwanted equipment from the visual scenes and to compose the different acoustic scenes by convolving the RIRs with anechoic target and noise signals. As future work, tests on a pool of normal-hearing subjects are planned to validate the created audiovisual scenes. Moreover, further scenes will be recorded that also involve the synchronization between sound and lip movement, which was excluded from the current work.
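
The abstract does not describe how the auralization was implemented, so the following is only a minimal sketch of the two ideas it names, under stated assumptions: an ambisonic stream of order N carries (N + 1)^2 spherical-harmonic channels (16 for 3OA), and a scene can be composed by convolving a dry (anechoic) mono signal with each channel of an ambisonic RIR and summing the target and noise results. All function names, the plain unit-gain mix, and the toy sample values are hypothetical and are not taken from the thesis.

```python
def ambisonic_channel_count(order: int) -> int:
    """Number of spherical-harmonic channels for a full-sphere ambisonic order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def convolve(signal: list[float], impulse_response: list[float]) -> list[float]:
    """Direct-form linear convolution; output length is len(signal) + len(ir) - 1."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def auralize(dry: list[float], rir_channels: list[list[float]]) -> list[list[float]]:
    """Convolve one dry mono signal with every channel of a multichannel RIR."""
    return [convolve(dry, rir) for rir in rir_channels]


def mix(a: list[list[float]], b: list[list[float]]) -> list[list[float]]:
    """Sum two multichannel signals channel by channel, zero-padding the shorter one.

    Assumption: unit gains; a real test would set a target-to-noise level ratio.
    """
    if len(a) != len(b):
        raise ValueError("both signals must have the same number of channels")
    mixed = []
    for ch_a, ch_b in zip(a, b):
        n = max(len(ch_a), len(ch_b))
        mixed.append([
            (ch_a[k] if k < len(ch_a) else 0.0) + (ch_b[k] if k < len(ch_b) else 0.0)
            for k in range(n)
        ])
    return mixed


# Toy data standing in for real recordings (hypothetical values).
n_channels = ambisonic_channel_count(3)
target_rir = [[1.0, 0.5] for _ in range(n_channels)]   # RIR from the target position
noise_rir = [[0.0, 0.25] for _ in range(n_channels)]   # RIR from a noise position
target_dry = [1.0, 2.0]                                # anechoic speech excerpt
noise_dry = [4.0]                                      # anechoic noise excerpt

scene = mix(auralize(target_dry, target_rir), auralize(noise_dry, noise_rir))
```

The direct-form loop keeps the sketch dependency-free; for RIRs of realistic length, an FFT-based convolution from a numerical library would be the practical choice.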

Supervisors: Arianna Astolfi, Marco Carlo Masoero
Academic year: 2022/23
Publication type: Electronic
Number of pages: 88
Subjects:
Degree programme: Master's degree programme in Cinema and Media Engineering (Ingegneria del Cinema e dei Mezzi di Comunicazione)
Degree class: New system > Master's degree > LM-32 - Computer Engineering
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/26920