Text-to-SQL for Fact Extraction and Verification Over Tabular Data

Ali Yassine

Text-to-SQL for Fact Extraction and Verification Over Tabular Data.

Supervisors: Luca Cagliero, Simone Papicchio. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2024

PDF (Tesi_di_laurea) - Thesis
Restricted to: Repository staff only until 31 October 2025 (embargo date).
License: Creative Commons Attribution Non-commercial No Derivatives.

Download (882kB)
Abstract:

This thesis presents an integrated approach to extracting and validating facts from tabular data drawn from the FEVEROUS dataset, which is built from Wikipedia pages. In addition to translating natural language queries into SQL statements tailored to the structure of the dataset's tables, the approach includes a retriever component that efficiently identifies the relevant data entries. The study evaluates how accurately this combined approach extracts and validates facts from tabular data. It also explores the use of retrieval models, followed by sequence-to-sequence (seq2seq) models and large language model (LLM) prompts, for constructing knowledge bases, extracting information, and implementing query answering systems.
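
As a rough illustration of the retrieve-then-query idea described in the abstract, the sketch below picks a candidate table by simple token overlap and then checks a claim by running a SQL query against it. It is a minimal sketch under stated assumptions, not the method used in the thesis: the tables and their contents are invented, `retrieve_table` and `verify` are hypothetical helper names, the SQL string is written by hand where a seq2seq model or LLM prompt would normally produce it, and the verdict strings are illustrative labels.

```python
import sqlite3

# Toy tables standing in for Wikipedia tables (invented data, not taken from FEVEROUS).
TABLES = {
    "mountains": {
        "columns": ["name", "height_m"],
        "rows": [("Alpha Peak", 4100), ("Beta Peak", 3800)],
    },
    "rivers": {
        "columns": ["name", "length_km"],
        "rows": [("Gamma River", 650), ("Delta River", 1200)],
    },
}


def retrieve_table(claim: str) -> str:
    """Hypothetical retriever: return the table whose vocabulary overlaps most with the claim."""
    claim_tokens = set(claim.lower().replace(".", "").split())

    def score(name: str) -> int:
        table = TABLES[name]
        vocab = set(name.split("_"))
        for col in table["columns"]:
            vocab.update(col.lower().split("_"))
        for row in table["rows"]:
            for cell in row:
                vocab.update(str(cell).lower().split())
        return len(claim_tokens & vocab)

    return max(TABLES, key=score)


def verify(claim: str, sql: str) -> str:
    """Load the retrieved table into in-memory SQLite and run a COUNT(*) query for the claim."""
    table_name = retrieve_table(claim)
    table = TABLES[table_name]
    conn = sqlite3.connect(":memory:")
    try:
        columns = ", ".join(table["columns"])
        conn.execute(f"CREATE TABLE {table_name} ({columns})")
        placeholders = ", ".join("?" for _ in table["columns"])
        conn.executemany(
            f"INSERT INTO {table_name} VALUES ({placeholders})", table["rows"]
        )
        (count,) = conn.execute(sql).fetchone()
    finally:
        conn.close()
    # Illustrative verdict labels; at least one matching row supports the claim.
    return "SUPPORTS" if count > 0 else "REFUTES"


# Hand-written SQL standing in for the output of a text-to-SQL model (assumption).
CLAIM = "Alpha Peak is taller than 4000 metres."
SQL = "SELECT COUNT(*) FROM mountains WHERE name = 'Alpha Peak' AND height_m > 4000"

if __name__ == "__main__":
    print(retrieve_table(CLAIM), verify(CLAIM, SQL))
```

Expressing the claim as a `COUNT(*)` query keeps the verdict logic trivial; a fuller system would also need to handle claims that the retrieved table cannot answer.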

Supervisors: Luca Cagliero, Simone Papicchio
Academic year: 2024/25
Publication type: Electronic
Number of Pages: 59
Subjects:
Degree programme: Master's degree programme in Data Science and Engineering
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/33097