
Developing an Enterprise Chatbot using Machine Learning Models: A RAG and NLP based approach

Fabio Rizzi

Supervisor: Paolo Garza. Politecnico di Torino, Master's degree course in Data Science And Engineering, 2024

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Process automation has become essential for efficiency in today's digital age. As the volume and complexity of data grow, automation speeds up repetitive and time-consuming tasks. This work aims to create a chatbot that automates specific business processes: a conversational system capable of processing complex information and providing comprehensive answers in textual, tabular or graphical form, extracted directly from company documents or databases. The chatbot was built with state-of-the-art open-source machine learning models. Several models were tested together with search optimization methods, such as RAG (Retrieval-Augmented Generation) and its variants, to improve the efficiency and quality of the results. Data preprocessing techniques were also applied to improve performance both at the data level and across the entire pipeline. Experiments to identify the best methods and parameters were conducted on business data from documents and relational databases. At the end of the experiments, different task-specific models were selected: one for SQL querying, one for information retrieval, and one for natural language generation. The tests produced promising results and led to two main pipelines: one with the best accuracy but a longer execution time, and another that is less accurate but significantly faster. The faster pipeline was ultimately chosen, as the slight reduction in accuracy was deemed acceptable in exchange for the improvement in response speed. Taking into account hardware constraints and specific business requirements, the chatbot was designed to operate effectively in a real production environment, providing fast and accurate support to users. During development, several advanced techniques were integrated to optimize RAG, improving both the quality of information retrieval and the overall interaction with the system.
In conclusion, this thesis proposes a solution for building machine-learning-based chatbots that incorporates techniques not commonly used in traditional approaches, and it demonstrates the effectiveness of a highly automated and optimized framework. The results open up new opportunities for developing more advanced and customized human-computer interaction systems.
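As an illustration of the retrieve-then-generate idea behind RAG mentioned in the abstract, the following minimal Python sketch ranks text chunks against a question and assembles a prompt for a generation model. It is not the thesis's pipeline: the bag-of-words cosine similarity stands in for a learned retrieval model, and the function names (`retrieve`, `build_prompt`), the prompt wording and the sample `CHUNKS` are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Lowercase and split into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank chunks by similarity to the question and keep the top k.
    # Assumption: a real system would use an embedding model here.
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True
    )
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    # Assemble the text that would be passed to the generation model.
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical company-document chunks, for illustration only.
CHUNKS = [
    "Invoices are approved by the finance team within five working days.",
    "Holiday requests must be submitted through the HR portal.",
    "The finance team archives approved invoices every quarter.",
]

if __name__ == "__main__":
    question = "How do I submit holiday requests?"
    print(build_prompt(question, retrieve(question, CHUNKS, k=1)))
```

Keeping retrieval and prompt construction as separate functions mirrors the abstract's point that different task-specific models handle retrieval and generation, so either step can be swapped independently.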

Supervisors: Paolo Garza
Academic year: 2024/25
Publication type: Electronic
Number of Pages: 74
Subjects:
Degree course: Master's degree course in Data Science And Engineering
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: BETACOM SRL
URI: http://webthesis.biblio.polito.it/id/eprint/33210