Detecting fake news using Natural Language Processing

Giulio Alfarano

Detecting fake news using Natural Language Processing.

Supervisors: Elena Maria Baralis, Raphael Troncy. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2022

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

The enormous dissemination of data and information enabled by the Internet has brought great freedom of expression and information to almost the entire world population. At the same time, however, the spread of information brings with it the spread of false information, known in the field as "fake news" or "misinformation". This phenomenon has led the scientific community to take an interest in the problem and to develop solutions that can curb it. A fundamental building block of these solutions is Natural Language Processing, a family of algorithmic techniques built on Machine Learning and, more generally, artificial intelligence. In this work we address two challenges held during 2021, FEVEROUS and MediaEval 2021, both of which concern the identification of fake news through Natural Language Processing and the most widely used models for text classification. The first challenge dealt with verifying the truthfulness of claims in a database provided ad hoc for the challenge, by retrieving evidence from the English Wikipedia corpus; the second concerned the identification of misinformation in a dataset of real tweets about the recent Covid-19 pandemic and related conspiracy theories. Among the tested models, the most explored techniques are, first, classification algorithms applied after pre-processing and encoding the text and, second, Transformers, which are considered the current state of the art for the main Natural Language Processing tasks, including the one analysed here, and are also used in other fields. The promising results show that this type of solution can be an excellent tool to help the community face this problem from both a social and a technical point of view.

Supervisors: Elena Maria Baralis, Raphael Troncy
Academic year: 2021/22
Publication type: Electronic
Number of Pages: 93
Degree programme: Master's degree programme in Data Science And Engineering
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Joint supervision institution: EURECOM (France)
Collaborating companies: Eurecom
URI: http://webthesis.biblio.polito.it/id/eprint/22641