Catia Blengino
Improving Document Summarization Using Crosslingual Word Embeddings.
Supervisors: Luca Cagliero, Paolo Garza. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2020
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
In recent years, the volume of information available online in multiple languages has grown beyond what a user can examine manually, and several text summarization techniques have been developed in response. This thesis proposes a new methodology for extracting significant sentences from a collection of textual documents written in multiple languages. Specifically, it aims to produce a summary in any of the source languages while also exploiting the semantic relationships between cross-lingual content. To this end, it uses aligned word embedding models to extract cross-lingual relationships and a graph-based approach to select the most significant sentences. The results demonstrate that using cross-lingual text correlations improves summarizer performance. |
---|---|
Supervisors: | Luca Cagliero, Paolo Garza |
Academic year: | 2019/20 |
Publication type: | Electronic |
Number of pages: | 83 |
Subjects: | |
Degree programme: | Master's degree programme in Computer Engineering (Ingegneria Informatica) |
Degree class: | New system > Master's degree > LM-32 - COMPUTER ENGINEERING |
Collaborating companies: | Not specified |
URI: | http://webthesis.biblio.polito.it/id/eprint/14348 |
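
The abstract describes two ingredients: word embeddings aligned across languages, and a graph-based ranking of sentences. The sketch below illustrates how those two ideas can fit together; it is not the thesis's implementation. It assumes a toy embedding table with invented vectors (`EMBEDDINGS`), averaged word vectors as sentence representations, cosine similarity as edge weight, and a PageRank-style iteration as the graph algorithm. All function names are hypothetical, and selecting the summary in one specific source language is not modelled.

```python
import math

# Toy "aligned" embeddings: English and Italian words share one vector space.
# These vectors are invented for illustration, not taken from a trained model.
EMBEDDINGS = {
    "cat": (0.9, 0.1, 0.0), "gatto": (0.88, 0.12, 0.0),
    "dog": (0.8, 0.2, 0.1), "cane": (0.79, 0.22, 0.1),
    "sleeps": (0.1, 0.9, 0.0), "dorme": (0.12, 0.88, 0.0),
    "stock": (0.0, 0.1, 0.9), "market": (0.1, 0.0, 0.95),
    "fell": (0.0, 0.3, 0.8),
}


def sentence_vector(sentence):
    """Average the vectors of the known words in a sentence."""
    vectors = [EMBEDDINGS[w] for w in sentence.lower().split() if w in EMBEDDINGS]
    if not vectors:
        return (0.0, 0.0, 0.0)
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration on a similarity graph."""
    vectors = [sentence_vector(s) for s in sentences]
    n = len(sentences)
    # Edge weight = cosine similarity between sentence vectors, no self-loops.
    weights = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(
                weights[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


def summarize(sentences, k=1):
    """Return the k highest-scoring sentences in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


sentences = [
    "the cat sleeps",
    "il gatto dorme",
    "il cane dorme",
    "stock market fell",
]
scores = rank_sentences(sentences)
summary = summarize(sentences, k=2)
```

Because the English and Italian sentences about sleeping animals sit close together in the shared space, they reinforce each other in the graph, while the unrelated sentence receives the lowest score. That mutual reinforcement across languages is the intuition behind using cross-lingual correlations in the ranking step.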