Causal-Aware RAG in Industrial Support: Mining PCS Triplets from Technical Emails

Andrea Bioddo

Supervisor: Flavio Giobergia. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2025

Abstract:

Manufacturers of industrial machinery face a persistent challenge in managing technical knowledge for after-sales support. Expertise remains dispersed across unstructured email exchanges and poorly organized documentation, leading to slow responses, inconsistent service quality and progressive loss of know-how. While recent advances in Retrieval-Augmented Generation (RAG) have enabled document-grounded assistants, current systems struggle to capture the causal dependencies linking problems, causes and solutions that support technical reasoning. Moreover, causal extraction methods are hindered by limited annotated data and insufficient integration with external knowledge bases, often resulting in uncontrolled hallucinations and weak factual grounding. This thesis addresses these limitations through two complementary contributions. The first, Neuratio, is a multi-agent RAG platform that integrates seamlessly into existing email workflows, combining historical tickets, technical manuals and spare parts catalogs to generate context-aware, AI-assisted responses. Its performance, however, depends on the quality and completeness of the underlying knowledge base. To address this dependency, the second and core contribution, ExtractKB, automatically constructs a structured knowledge base by mining Problem–Cause–Solution triplets from historical conversations. The system employs a multi-turn causal reasoning approach inspired by Causal-Chain Prompting (C²P), combined with RAG-based validation and iterative self-correction. The evaluation comprises both a qualitative field study and a quantitative experiment. Neuratio was deployed in industrial environments and received positive feedback on efficiency and usability from field technicians. ExtractKB was benchmarked on mixed datasets of real and synthetic customer emails, demonstrating that multi-stage prompting and RAG validation jointly reduce hallucinations and enhance factual consistency compared to single-prompt baselines. 
The proposed methodology thus enables reliable causal knowledge extraction in low-resource industrial domains, bridging the gap between LLM research and real-world technical support.
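
The extract, validate and self-correct loop described in the abstract can be illustrated with a minimal sketch. The full text of the thesis is not available, so this is not the ExtractKB implementation: every name here (`PCSTriplet`, `is_grounded`, `extract_with_self_correction`, `stub_propose`) is hypothetical, the word-overlap check is only a stand-in for RAG-based validation, the stub replaces the LLM call, and the email and manual text are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class PCSTriplet:
    """One Problem-Cause-Solution record mined from an email thread."""
    problem: str
    cause: str
    solution: str


def is_grounded(claim: str, passages: List[str], threshold: float = 0.5) -> bool:
    """Toy grounding check: share of claim words found in a single passage.

    Assumption: stands in for RAG-based validation, which would retrieve
    passages and judge support with a model rather than word overlap.
    """
    words = {w.strip(".,;:").lower() for w in claim.split()}
    words.discard("")
    if not words:
        return False
    for passage in passages:
        passage_words = {w.strip(".,;:").lower() for w in passage.split()}
        if len(words & passage_words) / len(words) >= threshold:
            return True
    return False


def extract_with_self_correction(
    thread: str,
    passages: List[str],
    propose: Callable[[str, List[str]], PCSTriplet],
    max_rounds: int = 3,
) -> Optional[PCSTriplet]:
    """Propose a triplet, validate each field, and retry with feedback.

    `propose(thread, feedback)` is a hypothetical hook for the LLM step.
    Returns None if no fully grounded triplet is found within max_rounds.
    """
    feedback: List[str] = []
    evidence = passages + [thread]
    for _ in range(max_rounds):
        triplet = propose(thread, feedback)
        unsupported = [
            name
            for name in ("problem", "cause", "solution")
            if not is_grounded(getattr(triplet, name), evidence)
        ]
        if not unsupported:
            return triplet
        feedback = [f"unsupported field: {name}" for name in unsupported]
    return None


# Invented example data, for illustration only.
THREAD = "The conveyor belt stops after a few minutes. We found the drive motor fuse blown."
PASSAGES = ["Manual 4.2: if the drive motor fuse is blown, replace the drive motor fuse."]


def stub_propose(thread: str, feedback: List[str]) -> PCSTriplet:
    """Stub for the LLM: the first attempt invents a cause, the retry fixes it."""
    cause = "drive motor fuse blown" if feedback else "cosmic rays"
    return PCSTriplet("conveyor belt stops", cause, "replace the drive motor fuse")


result = extract_with_self_correction(THREAD, PASSAGES, stub_propose)
print(result)
```

Returning `None` instead of the last unvalidated triplet keeps unsupported entries out of the knowledge base, which matches the abstract's emphasis on reducing hallucinations. A real system might instead route such cases to a human reviewer.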

Supervisors: Flavio Giobergia
Academic year: 2025/26
Publication type: Electronic
Number of pages: 89
Additional information: Confidential thesis. Full text not available
Subjects:
Degree programme: Master's degree programme in Computer Engineering (Ingegneria Informatica)
Degree class: New system > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/38606