
Beyond Cross-Entropy: Custom Loss Functions for Finetuning SLMs on Structured Recipe Generation

Mattia Ottoborgo

Beyond Cross-Entropy: Custom Loss Functions for Finetuning SLMs on Structured Recipe Generation.

Supervisors: Paolo Garza, Daniele Rege Cambrin. Politecnico di Torino, Master's degree course in Data Science and Engineering, 2025

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

This thesis explores the use of custom loss functions for the finetuning of Small Language Models (SLMs) applied to recipe generation. With the rapid growth of deep learning, Large Language Models have demonstrated remarkable text generation capabilities. Standard training frameworks based on the Cross-Entropy loss, however, often fall short in demanding domains that require factual and numerical accuracy, such as the generation of procedural texts like cooking recipes. This work tackles an intrinsic limitation of the standard objective, which treats all tokens equally and therefore fails to capture the important but statistically rare ingredients of a recipe. Generating a valid recipe presents several challenges: it requires a model to combine not only linguistic fluency but also procedural logic, factual knowledge about ingredients, and correct numerical reasoning for quantities, times, and temperatures. Errors in these critical entities render the output unusable and reveal a disconnect between a model’s textual coherence and its practical, real-world utility. This thesis argues that closing this gap requires modifying the training objective itself to better reflect the specific needs of the domain. The thesis first provides an overview of the foundational concepts of modern NLP, from the Transformer architecture to the lifecycle and methodologies for finetuning language models, with a focus on Parameter-Efficient Finetuning (PEFT) via Low-Rank Adaptation (LoRA). It then details the design of a composite loss framework, augmenting the standard Cross-Entropy loss with one of three custom losses: Focal Loss, to address token imbalance; Dice Loss, to optimize for semantic overlap; and a novel Topological Loss, designed to measure the geometric similarity between the predicted and ground-truth ingredient lists in the embedding space. Lastly, the thesis benchmarks several SLMs finetuned with these composite losses against a plain Cross-Entropy baseline. The models are assessed with a comprehensive set of evaluation metrics that includes standard NLP benchmarks as well as ad-hoc measures of ingredient recall, numerical precision, and procedural correctness. The experiments show that augmenting the training objective with domain-aware custom losses yields measurable gains, offering an improved approach to finetuning language models for structured, fact-intensive generation tasks.
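
As a rough illustration of the kind of composite objective described in the abstract, the sketch below combines token-level Cross-Entropy with a Focal Loss term in PyTorch. The mixing coefficient alpha, the gamma value, and the function names are illustrative assumptions, not the thesis's actual implementation, which also covers the Dice and Topological variants.

import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, ignore_index=-100):
    # Token-level focal loss: down-weights tokens the model already predicts
    # well, so rare but important tokens (e.g. ingredient names) contribute
    # more to the gradient.
    logits = logits.view(-1, logits.size(-1))   # (batch*seq, vocab)
    targets = targets.view(-1)                  # (batch*seq,)
    mask = targets != ignore_index              # skip padding / prompt tokens

    log_probs = F.log_softmax(logits[mask], dim=-1)
    target_log_probs = log_probs.gather(1, targets[mask].unsqueeze(1)).squeeze(1)
    pt = target_log_probs.exp()                 # probability of the true token
    return -((1.0 - pt) ** gamma * target_log_probs).mean()

def composite_loss(logits, targets, alpha=0.5):
    # Composite objective: standard Cross-Entropy plus a weighted focal term
    # (alpha is a hypothetical mixing coefficient).
    ce = F.cross_entropy(
        logits.view(-1, logits.size(-1)), targets.view(-1), ignore_index=-100
    )
    return ce + alpha * focal_loss(logits, targets)

In a training loop, composite_loss(outputs.logits, labels) would replace the default loss returned by the model; the same pattern extends to the Dice or Topological terms by swapping the second summand.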

Supervisors: Paolo Garza, Daniele Rege Cambrin
Academic year: 2025/26
Publication type: Electronic
Number of pages: 116
Subjects:
Degree course: Master's degree course in Data Science and Engineering
Degree class: New regulations > Master of Science > LM-32 - COMPUTER ENGINEERING
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/38753