Extraction and selection of vocal features for the assessment of surgeries and rehabilitation of post laryngectomy patients

Giulia Resio

Extraction and selection of vocal features for the assessment of surgeries and rehabilitation of post laryngectomy patients.

Supervisors: Alessio Carullo, Alberto Vallan. Politecnico di Torino, Master's degree programme in Biomedical Engineering (Corso di laurea magistrale in Ingegneria Biomedica), 2022

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Open Partial Horizontal Laryngectomies (OPHL) are widely used surgeries for laryngeal carcinoma that lead to post-operative impairment of primary functions such as phonation. In the worst cases, the surgery involves the removal of both vocal cords (OPHL types II and III), and the outcome is a very hoarse and breathy voice, called a “substitution voice”. Patients must follow a rehabilitation path to partially restore the abilities impaired by the surgical procedure, and auditory-perceptual evaluation scales such as the INFVo are commonly used in clinical practice to assess the effect of rehabilitation on voice quality. The goal of this work is to define a procedure, based on analysis of patients' voices, that extracts representative parameters and provides objective data on rehabilitation results. The data set used in this thesis was supplied by San Giovanni Bosco Hospital (Turin) and consists of 85 patients grouped by the type of operation they underwent: 22 OPHL-I, 32 OPHL-II, and 31 OPHL-III. All recordings were made with an in-air microphone system and include, for each patient, the sustained vowel /a/ and a phonetically balanced speech passage. First, the signals were pre-processed with Audacity and MATLAB (R2022a); then parameters were extracted over the whole recordings, but only those from harmonic frames were considered for feature extraction. The harmonic frames were selected using two different criteria: the first is based on the Harmonic-to-Noise Ratio (HNR) and the second on the Spectral Kurtosis (SK). Examples of extracted parameters are SK, HNR, and fundamental frequency (f0), together with other parameters in the spectral and cepstral domains, such as Cepstral Peak Prominence Smoothed (CPPS) and Mel-Frequency Cepstral Coefficients (MFCC).
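
The HNR-based selection of harmonic frames described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the selection threshold or the framing parameters, so the threshold value, the function names, and the toy per-frame values are all hypothetical.

```python
import numpy as np

def harmonic_frame_mask(hnr_db, threshold_db=0.0):
    """Boolean mask of frames treated as harmonic.

    ASSUMPTION: a frame counts as harmonic when its HNR (in dB) is finite
    and above `threshold_db`. The threshold is hypothetical; the abstract
    does not state the criterion's actual value.
    """
    hnr_db = np.asarray(hnr_db, dtype=float)
    return np.isfinite(hnr_db) & (hnr_db > threshold_db)

def keep_harmonic_frames(values, mask):
    """Keep only the per-frame parameter values that fall on harmonic frames."""
    values = np.asarray(values, dtype=float)
    return values[mask]

# Toy per-frame values (hypothetical, not taken from the thesis data set).
hnr = np.array([-3.0, 5.2, 8.1, np.nan, 1.5])   # HNR per frame, in dB
f0 = np.array([0.0, 110.0, 115.0, 0.0, 108.0])  # f0 per frame, in Hz

mask = harmonic_frame_mask(hnr, threshold_db=2.0)
f0_harmonic = keep_harmonic_frames(f0, mask)
```

Any per-frame parameter computed over the whole recording can be filtered with the same mask before its statistics are computed.
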
Each parameter is treated as a probability distribution and summarized through descriptive statistics (indices of central tendency and range, measures of variability). Nine further parameters were added for the vowel /a/ to evaluate period and amplitude stability, resulting in 198 features for the vowel /a/ and 189 for the balanced speech. Data were classified with a Logistic Regression (LR) model, first comparing the types of intervention (OPHL I vs OPHL II, III) and then comparing patients within the worst cases (OPHL II, III), divided into two classes according to index I (intelligibility) of the INFVo scale. Feature selection relied on the accuracy (Acc) of the LR model, or on the Area Under the Curve in case of a tie; the model was trained using a single feature and then combinations of 2, 3, and 4 features with low (R² < 0.5) and statistically significant (p-value < 0.05) correlation. Finally, a method was proposed to quantify the role of the expanded uncertainty U(p) of the probability p provided by the LR model, taking into account the variances and covariances of the model parameters. Confidence intervals were built for each probability, and a "non-classified" class was introduced, to be excluded from accuracy evaluations. New metrics such as Fraction Of Classified (Foc) and Realistic Accuracy (Accreal) were proposed to assess classification performance. Classification gave good results, mainly with the SK method on balanced speech for OPHL I vs II, III, with Acc values up to 96.5% when selecting Spectral Entropy (95th percentile) and f0 (5th and 95th percentiles). The new metrics proved effective. For instance, with the HNR method on balanced speech for OPHL I vs II, III, selecting f0 (range) and HNR (skewness) gave Acc = 94.1%, Accreal = 95.9%, and Foc = 0.87.
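
The uncertainty-aware classification and the two metrics can be sketched as below. The abstract does not give the formulas, so everything here is an assumption: U(p) is obtained by first-order propagation of the covariance of the LR coefficients with a coverage factor k = 2; a sample is "non-classified" when the interval [p - U, p + U] contains the 0.5 decision threshold; Foc is the fraction of samples that receive a class; Accreal is the accuracy computed on classified samples only. These assumed definitions are at least numerically compatible with the reported example: 74 of 85 patients classified with 71 correct would give Foc ≈ 0.87 and Accreal ≈ 95.9%. All function names are hypothetical.

```python
import numpy as np

NON_CLASSIFIED = -1  # label used here for the "non-classified" class

def probability_with_uncertainty(x, beta, cov, k=2.0):
    """Return (p, U) for one sample of a logistic regression model.

    x    : feature vector with a leading 1 for the intercept
    beta : fitted coefficients (intercept first)
    cov  : covariance matrix of the coefficients
    k    : coverage factor (ASSUMPTION: k = 2)
    """
    x = np.asarray(x, dtype=float)
    beta = np.asarray(beta, dtype=float)
    cov = np.asarray(cov, dtype=float)
    z = float(x @ beta)
    p = 1.0 / (1.0 + np.exp(-z))
    var_z = float(x @ cov @ x)                 # variance of the linear predictor
    u_p = p * (1.0 - p) * np.sqrt(var_z)       # first-order propagation: dp/dz = p(1-p)
    return p, k * u_p

def classify_with_rejection(p, U, threshold=0.5):
    """Assign class 1 or 0 only if [p - U, p + U] lies entirely on one side of the threshold."""
    if p - U > threshold:
        return 1
    if p + U < threshold:
        return 0
    return NON_CLASSIFIED

def foc_and_realistic_accuracy(y_true, y_pred):
    """Fraction of classified samples and accuracy restricted to them (assumed definitions)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    classified = y_pred != NON_CLASSIFIED
    foc = float(classified.mean())
    if not classified.any():
        return foc, float("nan")
    acc_real = float((y_pred[classified] == y_true[classified]).mean())
    return foc, acc_real
```

Under these assumptions, a wide interval around a probability close to 0.5 removes the sample from the accuracy count and lowers Foc, so the two metrics are meant to be read together.
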

Supervisors: Alessio Carullo, Alberto Vallan
Academic year: 2022/23
Publication type: Electronic
Number of pages: 82
Subjects:
Degree programme: Master's degree in Biomedical Engineering (Corso di laurea magistrale in Ingegneria Biomedica)
Degree class: New regulations > Master's degree > LM-21 - BIOMEDICAL ENGINEERING
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/25727