Vincenzo Montana
Human-Aligned Speech Language Models with Preference Alignment Data Collection.
Supervisors: Eliana Pastor, Alkis Koudounas. Politecnico di Torino, Master of Science program in Computer Engineering, 2025
PDF (Tesi_di_laurea)
- Thesis
Restricted to: staff users only until 12 June 2027 (embargo date). Licence: Creative Commons Attribution Non-commercial No Derivatives. Download (2MB)
Abstract
Preference alignment techniques have achieved remarkable results in aligning Large Language Models (LLMs) with human values through output comparison. However, these methods critically rely on human-annotated preference data, whose collection remains a major challenge due to scalability and consistency issues. The complexity further increases in the multi-modal domain, where annotators may focus on isolated aspects of a given modality (e.g., speech tone or rhythm) rather than its overall communicative intent. Positioned within this context, the present work specifically addresses these challenges in the speech domain. The primary goal is to collect human preference data on speech-based interactions, ensuring that annotators are properly guided to provide consistent and meaningful feedback.
To this end, a multi-stage speech pipeline, emulating a full conversation with a digital assistant, was designed
