Francesco Giannuzzo
Automatic Generation of Tool-Use Traces for Evaluating LLM Agents.
Supervisors: Paolo Garza, Paolo Papotti. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2026
Abstract
Autonomous agents powered by large language models (LLMs) are increasingly expected to operate over structured data environments via external tools. Yet evaluating their capabilities, such as planning, tool selection, ambiguity handling, and multi-step execution, remains challenging. Existing benchmarks often depend on manually crafted scenarios, which are costly to create and hard to scale. This thesis introduces SyntheticAgentTraceQA, a pipeline that automatically generates, executes, and validates synthetic task–tool interaction traces grounded in real data. The pipeline follows a top-down strategy: instead of starting from a natural language query, it first constructs abstract operational templates using a taxonomy of tool roles (e.g., entity resolution, data retrieval, analysis, aggregation).
These templates are instantiated and executed, and the resulting traces are validated for correctness.
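The top-down strategy described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the toy data, the tool functions (`resolve_customer`, `fetch_sales`, `total`), the `Step` record, and the two validation checks are all hypothetical and chosen only to show the idea of building a template from tool roles, executing it, and validating the resulting trace.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class ToolRole(Enum):
    """Tool roles named in the abstract's taxonomy."""
    ENTITY_RESOLUTION = "entity_resolution"
    DATA_RETRIEVAL = "data_retrieval"
    ANALYSIS = "analysis"
    AGGREGATION = "aggregation"


# Hypothetical toy data standing in for a real data environment.
NAMES = {"acme corp": "C001", "globex": "C002"}
SALES = {"C001": [120.0, 80.0, 50.0], "C002": [30.0]}


def resolve_customer(name: str) -> str:
    """Entity resolution: map a free-text name to an identifier."""
    return NAMES[name.strip().lower()]


def fetch_sales(customer_id: str) -> list:
    """Data retrieval: look up the records for an identifier."""
    return SALES[customer_id]


def total(values: list) -> float:
    """Aggregation: reduce the records to a single number."""
    return sum(values)


# Hypothetical registry: one concrete tool per role used below.
TOOLS = {
    ToolRole.ENTITY_RESOLUTION: resolve_customer,
    ToolRole.DATA_RETRIEVAL: fetch_sales,
    ToolRole.AGGREGATION: total,
}


@dataclass
class Step:
    role: ToolRole
    tool: str
    input: Any
    output: Any


def execute_template(template: list, seed_input: Any) -> list:
    """Instantiate each role with a tool and run the chain, recording a trace."""
    trace = []
    value = seed_input
    for role in template:
        tool = TOOLS[role]
        result = tool(value)
        trace.append(Step(role, tool.__name__, value, result))
        value = result
    return trace


def validate_trace(template: list, trace: list) -> bool:
    """Check that roles match the template and each output feeds the next input."""
    if [step.role for step in trace] != list(template):
        return False
    return all(prev.output == cur.input for prev, cur in zip(trace, trace[1:]))


# An abstract operational template: a sequence of roles, not a query.
template = [
    ToolRole.ENTITY_RESOLUTION,
    ToolRole.DATA_RETRIEVAL,
    ToolRole.AGGREGATION,
]
trace = execute_template(template, "Acme Corp")
is_valid = validate_trace(template, trace)
```

Here the template is fixed before any concrete tool call is made, so the expected sequence of roles is known in advance and the executed trace can be checked against it.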
