Interpreting the output of Transformer-based architectures for text summarization

Juan Jose Marquez Villacis

Supervisor: Giuseppe Rizzo. Politecnico di Torino, Master's degree programme in Data Science and Engineering, 2023

Thesis (PDF) and attached documents (ZIP archive). License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Current state-of-the-art models for automatic summarization use Transformer-based architectures trained on large corpora. The novelty of these architectures lies in the attention mechanism, which links the generated output to the input tokens and assigns greater weight to the parts that are most relevant to the task. Interpreting these attention scores is a first step towards understanding the inner workings of these architectures. However, recent studies have shown that attention scores alone do not provide an exhaustive interpretation of the output of these systems. This thesis investigates the role of explainability mechanisms in Transformer-based architectures designed for text summarization.
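
As a minimal illustration of the attention scores mentioned above (not taken from the thesis), the following sketch computes scaled dot-product attention weights with NumPy. The function name `attention_weights`, the toy dimensions, and the random vectors are assumptions made for illustration only; real summarization models use learned projections and multiple attention heads.

```python
import numpy as np


def attention_weights(queries, keys):
    """Return softmax(Q K^T / sqrt(d)) for 2-D arrays of queries and keys."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    # Subtract the row-wise maximum for numerical stability before exponentiating.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)


# Hypothetical toy data: 2 generated (output) tokens attending over 4 input tokens.
rng = np.random.default_rng(0)
output_states = rng.normal(size=(2, 8))
input_states = rng.normal(size=(4, 8))

weights = attention_weights(output_states, input_states)

# Index of the input token that receives the largest weight for each output token.
most_attended = weights.argmax(axis=-1)
```

Each row of `weights` sums to 1 and can be read as how strongly one output token attends to each input token. As the abstract notes, such scores on their own are not a complete explanation of a model's output.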