
RAG system for automatic report generation

Pietro Noto

RAG system for automatic report generation.

Supervisors: Daniele Apiletti, Simone Monaco. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2025

Abstract:

Retrieval-Augmented Generation (RAG) has emerged as a powerful approach to enhancing the accuracy and relevance of responses in natural language processing applications. This thesis explores the development and evaluation of an advanced RAG system tailored for a company specializing in survey analysis within the dental products industry. The primary goal of the system is to answer questions, ranging from simple factual inquiries to complex analytical queries, based on survey data collected from various business partners, so that the actors in the dental market and their relationships can be understood better and, above all, more easily. For instance, it is much easier to ask the system “What was the most preferred brand for *DENTAL PRODUCT* among dentists in *COUNTRY* in *YEAR*?” than to search a database manually. A fundamental challenge in RAG-based question answering (Q&A) systems lies in the retrieval process: ensuring that only the most relevant documents are selected to inform the generative model. A document is considered “relevant” if it contains portions of text that are semantically similar to the user query. This research compares two types of approaches: a naive RAG implementation, which retrieves information from the entire document set, and advanced RAG approaches, which selectively retrieve data from the most relevant documents, taken from a properly filtered subset of the initial database. The thesis demonstrates that a more precise, special-purpose retrieval mechanism significantly enhances response accuracy, coherence, and informativeness, while reducing hallucinations, compared with an off-the-shelf naive approach. The results of this study have significant implications for businesses relying on survey-based insights. By deploying an advanced RAG system, organizations in the dental products industry can derive more accurate and actionable insights from their survey data, ultimately improving decision-making and business strategies.
The findings of this research contribute to the broader field of retrieval-augmented generation by demonstrating that sophisticated retrieval mechanisms and tailored document selection strategies play a crucial role in maximizing the effectiveness of RAG-based Q&A systems.
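
The contrast between naive retrieval over the whole document set and retrieval over a filtered subset can be illustrated with a minimal sketch. It is not the thesis implementation, whose full text is not available. The bag-of-words `embed` function is a toy stand-in for a real embedding model, the `country` and `year` metadata fields are hypothetical, and the three documents are invented placeholder data.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Doc:
    text: str
    country: str  # hypothetical metadata field
    year: int     # hypothetical metadata field


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list, k: int = 1) -> list:
    """Naive retrieval: rank every given document by similarity to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d.text)), reverse=True)[:k]


def retrieve_filtered(query: str, docs: list, k: int = 1, **filters) -> list:
    """Filtered retrieval: keep documents whose metadata match, then rank."""
    subset = [d for d in docs
              if all(getattr(d, name) == value for name, value in filters.items())]
    return retrieve(query, subset, k)


# Invented placeholder documents, not data from the thesis.
DOCS = [
    Doc("Most preferred toothpaste brand among dentists: Brand A", "Italy", 2022),
    Doc("Most preferred toothpaste brand among dentists: Brand B", "Italy", 2023),
    Doc("Most preferred toothpaste brand among dentists: Brand C", "France", 2023),
]

QUERY = "most preferred toothpaste brand among dentists"

# All three documents score identically against the query text alone,
# so naive retrieval cannot tell which country or year is meant.
naive_top = retrieve(QUERY, DOCS, k=1)

# Restricting the candidate set by metadata first removes the ambiguity.
filtered_top = retrieve_filtered(QUERY, DOCS, k=1, country="Italy", year=2023)
```

In this toy setting the text of the three documents is equally similar to the query, so only the metadata filter lets the retriever pick the document for the intended country and year. A real system would probably replace `embed` with a learned embedding model and derive the filter values from the user query.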

Supervisors: Daniele Apiletti, Simone Monaco
Academic year: 2024/25
Publication type: Electronic
Number of pages: 85
Additional information: Confidential thesis. Full text not available
Subjects:
Degree programme: Master's degree programme in Computer Engineering (Ingegneria Informatica)
Degree class: New system > Master's degree > LM-32 - INGEGNERIA INFORMATICA (Computer Engineering)
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/35281