Mohamad Samaei
Data Collection and Generation for Preference Alignment in Speech Language Models.
Supervisors: Eliana Pastor, Alkis Koudounas. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2025
PDF (Tesi_di_laurea) - Thesis
Restricted access: staff only until 12 June 2027 (embargo date). License: Creative Commons Attribution Non-commercial No Derivatives. Download (1MB)
Abstract
Despite recent advances, Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) models still misrecognize words in challenging conditions, limiting their reliability. Reinforcement Learning from Human Feedback (RLHF) mitigates hallucinations in speech models by replacing a purely statistical learning objective with a human-aligned optimization objective that rewards factual, grounded, and faithful outputs while penalizing hallucinated content. Current speech assistants are typically trained on proprietary data and report their performance with metrics such as Word Error Rate (WER). At the same time, RLHF methods have largely focused on text-only models, leaving a gap in tools and datasets for applying preference-alignment training to spoken dialogue systems.
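As a concrete illustration of the WER metric mentioned above (a minimal sketch, not code from the thesis): WER is the word-level Levenshtein edit distance between a reference transcript and an ASR hypothesis, divided by the number of reference words.

```python
# Minimal Word Error Rate (WER) computation: word-level Levenshtein
# distance (substitutions + insertions + deletions) over reference length.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deleting i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # inserting j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dp[i - 1][j] + 1
            insertion = dp[i][j - 1] + 1
            dp[i][j] = min(substitution, deletion, insertion)
    return dp[len(ref)][len(hyp)] / len(ref)

# One substitution ("sit" for "sat") and one deletion ("the") over
# a 6-word reference gives WER = 2/6 ≈ 0.33.
print(wer("the cat sat on the mat", "the cat sit on mat"))
```

Production systems typically use an established implementation (e.g. the jiwer library) rather than a hand-rolled one, but the definition is the same.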
This thesis addresses these gaps by presenting an open-source implementation of (i) a data generation and extraction pipeline for conversational speech agents and (ii) an annotation platform for collecting human feedback, with the goal of enabling RLHF in speech models.
