Mohamad Samaei
Data Collection and Generation for Preference Alignment in Speech Language Models.
Supervisors: Eliana Pastor, Alkis Koudounas. Politecnico di Torino, Master of Science program in Data Science and Engineering, 2025
PDF (Tesi_di_laurea)
- Thesis
Restricted to: Only staff users until 12 June 2027 (embargo date). Licence: Creative Commons Attribution Non-commercial No Derivatives. Download (1MB)
Abstract
Despite recent advances, Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) models still misrecognize words in challenging conditions, which limits their reliability. Reinforcement Learning from Human Feedback (RLHF) limits hallucinations in speech models by replacing purely statistical learning with a human-aligned optimization objective that rewards factual, grounded, and faithful outputs while penalizing hallucinated content. Current speech assistants are typically trained on proprietary data and evaluated with metrics such as Word Error Rate (WER). At the same time, RLHF methods have largely focused on text-only models, leaving a gap in tools and datasets for applying preference alignment training to spoken dialogue systems.
This thesis addresses these gaps by presenting an open-source implementation of (i) a data generation and extraction pipeline for conversational speech agents and (ii) an annotation platform for collecting human feedback, with the goal of enabling RLHF in speech models.
