Fabrizio Battiloro
Assessing VQA Model Reliability Through Systematic Evaluation of Corrupted Question Handling.
Supervisors: Luca Cagliero, Davide Napolitano. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2025
PDF (Tesi_di_laurea)
- Thesis
Access restricted to: staff users only until 11 October 2026 (embargo date). License: Creative Commons Attribution Non-commercial No Derivatives. Download (6MB)
Abstract
Assessing how well Visual Question Answering models (VQAs) handle question answering (QA) over multi-page documents requires addressing a key limitation: their vulnerability to distorted input. When questions contain typographical errors or incorrect references, VQAs often fail to recognize that these seemingly valid queries are in fact unanswerable. The problem is amplified in visually rich documents, where multimodal elements such as figures, tables, and complex layouts introduce additional ambiguity. This thesis introduces a new framework designed specifically to test the robustness of VQAs against corrupted questions. Unlike traditional VQA benchmarks, which give little consideration to distorted inputs, the framework systematically alters questions at several levels, manipulating linguistic entities, document structures, and visual layouts.
A preliminary verification step ensures that these modifications produce genuinely unanswerable questions before they are used for evaluation.
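As a rough illustration of the corrupt-then-verify idea described in the abstract, the following Python sketch swaps a document reference in a question and then checks that the new reference does not occur in the document. It is not the thesis's implementation: the names `Page`, `corrupt_entity`, and `is_unanswerable` are hypothetical, the sample pages and question are invented, and the plain substring check is only an assumed stand-in for the verification step, whose actual method the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical container for the extracted text of one document page."""
    number: int
    text: str


def corrupt_entity(question: str, original: str, replacement: str) -> str:
    """Swap the first mention of an entity in the question for another one."""
    if original not in question:
        raise ValueError(f"{original!r} does not occur in the question")
    return question.replace(original, replacement, 1)


def is_unanswerable(replacement: str, pages: list[Page]) -> bool:
    """Assumed stand-in check: the swapped-in entity appears on no page."""
    needle = replacement.lower()
    return all(needle not in page.text.lower() for page in pages)


# Invented two-page document and question, for illustration only.
pages = [
    Page(1, "Table 2 reports revenue for 2021 and 2022."),
    Page(2, "Figure 1 shows the quarterly trend."),
]
question = "What revenue does Table 2 report for 2022?"

# Corrupt the structural reference, then keep the question only if the
# new reference cannot be resolved anywhere in the document.
corrupted = corrupt_entity(question, "Table 2", "Table 7")
keep = is_unanswerable("Table 7", pages)
```

A corrupted question is retained for evaluation only when `keep` is true. Swapping in a reference that does exist elsewhere in the document would yield a question that might still be answerable, which is the case the verification step is meant to filter out.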
