
Structured Retrieval-Augmented Generation for Enterprise Knowledge Management

Umberto Piccardi. Structured Retrieval-Augmented Generation for Enterprise Knowledge Management. Supervisor: Andrea Bottino. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2025

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

This thesis addresses the problem of onboarding and knowledge retrieval in modern companies, where documentation is often voluminous, generic, and fragmented across many systems. Retrieval-augmented generation (RAG) combines a search step with text generation: the system retrieves relevant passages from knowledge bases and feeds them to a language model to produce more up-to-date and accurate responses. However, traditional RAG systems, based on simple vector or lexical search, struggle with complex questions that require linking information from different domains and synthesising it coherently. To overcome these drawbacks, we propose a RAG framework for industrial settings that combines structured retrieval with a knowledge graph. The graph-based design adds explicit relationships between concepts and entities to traditional retrieval, which enables the system to reason across related data and produce more logical, context-aware responses. This method improves overall factual consistency and explainability while strengthening RAG's capacity to handle intricate, cross-domain questions. The evaluation is built on the RAGAs framework, a collection of LLM-based metrics designed to assess retrieval and generation quality. Using a standard open-domain dataset, we compare the graph-augmented approach against a baseline RAG in terms of faithfulness, answer relevancy, and context precision. The results provide an initial validation of the framework’s potential before its application to enterprise documentation environments. Overall, this thesis contributes: (i) an analysis of the challenges posed by onboarding and fragmented enterprise knowledge, (ii) a graph-augmented RAG framework based on community summaries and local/global retrieval, and (iii) a holistic evaluation demonstrating the benefits of graph-based retrieval for enterprise knowledge management.
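
To make the contrast between plain retrieval and graph-augmented retrieval concrete, the following is a minimal sketch and not the thesis's implementation. It assumes a toy corpus, a hand-written passage graph, and bag-of-words cosine similarity standing in for vector search; every name in it (`PASSAGES`, `GRAPH`, `baseline_retrieve`, `graph_retrieve`, `build_prompt`) is hypothetical. It does not model community summaries or local/global retrieval, and it stops at prompt construction rather than calling a language model.

```python
import math
from collections import Counter

# Hypothetical toy corpus: passage id -> text.
PASSAGES = {
    "hr-1": "New hires request a laptop through the IT portal",
    "it-1": "The IT portal requires a VPN token issued by security",
    "sec-1": "Security issues VPN tokens after badge activation",
    "fin-1": "Expense reports are submitted monthly to finance",
}

# Hypothetical knowledge graph: explicit links between related passages.
GRAPH = {
    "hr-1": {"it-1"},
    "it-1": {"hr-1", "sec-1"},
    "sec-1": {"it-1"},
    "fin-1": set(),
}


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption made for brevity)."""
    return text.lower().split()


def score(query, passage):
    """Cosine similarity over word counts, a stand-in for vector search."""
    q, p = Counter(tokenize(query)), Counter(tokenize(passage))
    dot = sum(q[t] * p[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in p.values())
    )
    return dot / norm if norm else 0.0


def baseline_retrieve(query, k=1):
    """Baseline RAG retrieval: return the k best-scoring passage ids."""
    ranked = sorted(
        PASSAGES, key=lambda pid: score(query, PASSAGES[pid]), reverse=True
    )
    return ranked[:k]


def graph_retrieve(query, k=1, hops=1):
    """Start from the baseline hits, then follow graph links for `hops` steps."""
    selected = baseline_retrieve(query, k)
    frontier = set(selected)
    for _ in range(hops):
        # Neighbours of the current frontier that have not been selected yet.
        frontier = {n for pid in frontier for n in GRAPH[pid]} - set(selected)
        selected.extend(sorted(frontier))
    return selected


def build_prompt(query, passage_ids):
    """Assemble the context that would be passed to the language model."""
    context = "\n".join(f"- {PASSAGES[pid]}" for pid in passage_ids)
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    question = "How do new hires request a laptop"
    print(build_prompt(question, baseline_retrieve(question)))
    print()
    print(build_prompt(question, graph_retrieve(question, hops=2)))
```

In this toy setting the baseline returns only the passage that shares words with the question, while the graph variant also pulls in the linked IT and security passages, which is the kind of cross-domain context the abstract describes.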

Supervisors: Andrea Bottino
Academic year: 2025/26
Publication type: Electronic
Number of pages: 72
Subjects:
Degree programme: Master's degree programme in Data Science And Engineering
Degree class: New system > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies: DATA Reply S.r.l. con Unico Socio
URI: http://webthesis.biblio.polito.it/id/eprint/38769