Scientific Papers Slide Generation using Abstractive Text Summarization

Simone Manni

Scientific Papers Slide Generation using Abstractive Text Summarization.

Supervisors: Luca Cagliero, Moreno La Quatra. Politecnico di Torino, Master of Science in Computer Engineering (Ingegneria Informatica), 2021

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Slide-based presentations are increasingly important in scientific dissemination because they convey much of the information needed to understand a publication. They usually contain short summaries of the paper's main contributions and cover all sections of the original publication. Manually creating slide content, however, is an expensive task. Recent advances in machine learning and artificial intelligence have enabled automatic systems that aim to generate summaries of scientific articles. Those summaries can be used to reduce the amount of content that requires manual analysis. Limited research effort has been devoted to generating presentation slides from documents; indeed, this task suffers from a scarcity of publicly available material for benchmarking. This master's thesis proposes a new dataset, APPreD, consisting of pairs of papers and their corresponding presentation slides crawled from the ACL Anthology. It also proposes a deep-learning-based approach to the document-to-slide task. The proposed methodology entails (i) the classification of academic content into IMRaD classes and (ii) the fine-tuning of a pre-trained model for abstractive summarization of section content. The proposed methodology has been trained and tested on benchmark data collections. The implemented system relies only on the textual domain and can be further developed to address the multimodal domain, including multimedia objects. The system outperforms state-of-the-art summarization baselines according to several standard evaluation metrics.
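
The two-step methodology described in the abstract can be pictured with a toy sketch. Nothing below is taken from the thesis: the keyword lists, the function names, and the choice to classify by section heading are all hypothetical, and the summarizer is a simple first-sentence placeholder standing in for the fine-tuned abstractive model. It only illustrates how a classification step and a per-section summarization step might be chained.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists, one per IMRaD class (illustrative, not from the thesis).
IMRAD_KEYWORDS = {
    "Introduction": ("introduction", "background", "related work", "motivation"),
    "Methods": ("method", "approach", "model", "architecture", "dataset"),
    "Results": ("result", "experiment", "evaluation", "analysis"),
    "Discussion": ("discussion", "conclusion", "future work", "limitation"),
}


def classify_section(heading: str) -> str:
    """Step (i) stand-in: map a section heading to an IMRaD class by keyword match."""
    lowered = heading.lower()
    for label, keywords in IMRAD_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return label
    return "Introduction"  # arbitrary fallback for unmatched headings


def lead_summary(text: str, max_sentences: int = 1) -> str:
    """Step (ii) placeholder: keep the first sentence(s).

    This is an extractive lead baseline, NOT abstractive summarization;
    a fine-tuned sequence-to-sequence model would be plugged in here.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])


def paper_to_slides(sections, summarize=lead_summary):
    """Group one summary per section under its IMRaD class.

    sections: iterable of (heading, body) pairs.
    Returns a dict mapping IMRaD class -> list of slide bullet strings.
    """
    slides = defaultdict(list)
    for heading, body in sections:
        slides[classify_section(heading)].append(summarize(body))
    return dict(slides)
```

Calling paper_to_slides with a list of (heading, body) pairs returns one list of bullets per IMRaD class. Because the summarizer is passed in as a parameter, the placeholder can be swapped for any callable that maps section text to a summary.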

Supervisors: Luca Cagliero, Moreno La Quatra
Academic year: 2021/22
Publication type: Electronic
Number of pages: 62
Subjects:
Degree programme: Master of Science in Computer Engineering (Ingegneria Informatica)
Degree class: New regulations > Master of Science > LM-32 - Computer Engineering
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/21086