Improving Document Summarization Using Crosslingual Word Embeddings

Catia Blengino

Supervisors: Luca Cagliero, Paolo Garza. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2020

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

In recent years, the volume of information available online in multiple languages has grown beyond what a user can examine manually, which has motivated the development of several text summarization techniques. This thesis proposes a new methodology for extracting significant sentences from a collection of textual documents written in multiple languages. Specifically, it aims to produce a summary in any of the source languages while also exploiting the semantic relationships between cross-lingual content. To this end, it uses aligned word embedding models to extract cross-lingual relationships and a graph-based approach to select the most significant sentences. The results show that using cross-lingual text correlations improves summarizer performance.
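
The abstract names two building blocks: word embeddings aligned across languages, and a graph-based sentence ranker. The sketch below illustrates one common way such pieces can fit together; it is not the thesis's implementation. It assumes a toy 2-D embedding table (`EMBEDDINGS`, with invented values), sentence vectors obtained by averaging word vectors, cosine similarity as edge weight, and a PageRank-style power iteration. The function names `sentence_vector`, `rank_sentences`, and `summarize` are hypothetical.

```python
import math

# Hypothetical "aligned" embeddings: English and Italian words with similar
# meaning are placed close together. Values are invented for illustration;
# real aligned models have hundreds of dimensions.
EMBEDDINGS = {
    "cat": (0.9, 0.1), "gatto": (0.88, 0.12),
    "sleeps": (0.7, 0.3), "dorme": (0.72, 0.28),
    "stock": (0.1, 0.9), "market": (0.05, 0.95),
}


def sentence_vector(sentence):
    """Average the embeddings of the known words; None if no word is known."""
    vectors = [EMBEDDINGS[w] for w in sentence.lower().split() if w in EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dim))


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration on a similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    vecs = [sentence_vector(s) for s in sentences]
    # Edge weight = non-negative cosine similarity; no self-loops.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and vecs[i] is not None and vecs[j] is not None:
                weights[i][j] = max(cosine(vecs[i], vecs[j]), 0.0)
    out_weight = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1.0 - damping) / n
            + damping * sum(
                scores[j] * weights[j][i] / out_weight[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=1):
    """Return the k highest-scoring sentences, in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    docs = ["the cat sleeps", "il gatto dorme", "stock market"]
    for sentence, score in zip(docs, rank_sentences(docs)):
        print(f"{score:.3f}  {sentence}")
    print(summarize(docs, k=1))
```

In this toy collection the English and Italian sentences about the same topic reinforce each other through the shared embedding space, so both receive a higher score than the unrelated sentence. Words missing from the table (here "the" and "il") are simply skipped.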

Supervisors: Luca Cagliero, Paolo Garza
Academic year: 2019/20
Publication type: Electronic
Number of pages: 83
Subjects:
Degree programme: Master's degree programme in Computer Engineering (Ingegneria Informatica)
Degree class: New system > Master's degree > LM-32 - COMPUTER ENGINEERING (INGEGNERIA INFORMATICA)
Collaborating companies: NOT SPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/14348