Supervised and Contextual Fine-Tuning for Text-to-SQL on Enterprise Databases: An Orchestrated LLM Pipeline

Marco Pontrandolfo

Supervised and Contextual Fine-Tuning for Text-to-SQL on Enterprise Databases: An Orchestrated LLM Pipeline.

Supervisor: Paolo Garza. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2026

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Large Language Models (LLMs) have recently demonstrated strong capabilities in natural language understanding and generation, enabling new forms of interaction between users and data systems. Among these applications, text-to-SQL generation represents a particularly relevant use case in enterprise environments, where relational databases remain the backbone of data storage and business analytics. However, directly applying general-purpose LLMs to complex business databases poses significant challenges, including schema complexity, SQL dialect constraints, domain-specific business logic, and cost-efficiency considerations. This thesis investigates the effectiveness of different fine-tuning strategies for adapting LLMs to text-to-SQL tasks in enterprise-like settings. In particular, Supervised Fine-Tuning (SFT) and Contextual Fine-Tuning (Context FT) are systematically compared across multiple configurations.

Experiments are conducted using a realistic relational database derived from the Northwind schema, populated with synthetic data to simulate business-scale scenarios