
Detection of Suspicious Users Posting Claims about Cancer on Twitter

Massimo Piras

Detection of Suspicious Users Posting Claims about Cancer on Twitter.

Supervisor: Elena Maria Baralis. Politecnico di Torino, Master's degree in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica), 2018

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


Due to the massive success of social media, online user-generated content has increased exponentially in recent years. Twitter, as a microblogging platform, allows users to share their opinions and activities through short posts called tweets. However, opinion spammers see social networks like Twitter as an opportunity to propagate their ideas, promoting or discrediting a target product or service without revealing their true intentions. In this study, we focused on detecting suspicious users who posted dubious claims about cancer treatment and prevention on Twitter. We addressed the task with a supervised learning approach, framing it as a binary classification problem in which we had to predict whether each user was suspicious or genuine. We collected a set of 60 thousand cancer-related tweets posted in October 2017, involving more than 36 thousand users. Since manually labeling this many users would have been very laborious, we instead designed a set of features for each user, related both to the content of her posts and to her behavior on Twitter, and combined them to compute a spam score. The underlying idea was that suspicious users would show different feature distributions from genuine users, which would help us separate the two classes. We then ranked the users by spam score and used the ranking to assign the labels. Finally, we ran a few classifiers on the labeled data, showing that suspicious users had distinct textual and behavioral patterns that could be used to distinguish them from genuine ones.
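The labeling step described in the abstract (per-user features combined into a spam score, then a ranking used to assign labels) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, since the abstract does not give the details: the feature names (`url_ratio`, `duplicate_ratio`, `tweets_per_day`), the min-max normalization, the unweighted average used as the spam score, and the `top_fraction` cut-off are all hypothetical and not taken from the thesis.

```python
from typing import Dict, List

# Hypothetical per-user features; the abstract does not list the real ones.
users: Dict[str, Dict[str, float]] = {
    "user_a": {"url_ratio": 0.9, "duplicate_ratio": 0.8, "tweets_per_day": 40.0},
    "user_b": {"url_ratio": 0.1, "duplicate_ratio": 0.0, "tweets_per_day": 2.0},
    "user_c": {"url_ratio": 0.5, "duplicate_ratio": 0.3, "tweets_per_day": 10.0},
    "user_d": {"url_ratio": 0.2, "duplicate_ratio": 0.1, "tweets_per_day": 1.0},
}


def min_max_normalise(values: List[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant feature maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def spam_scores(users: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Assumed combination rule: unweighted mean of normalised features."""
    names = sorted(users)
    feature_names = sorted(next(iter(users.values())))
    scores = {name: 0.0 for name in names}
    for feat in feature_names:
        normalised = min_max_normalise([users[n][feat] for n in names])
        for name, value in zip(names, normalised):
            scores[name] += value / len(feature_names)
    return scores


def label_by_rank(scores: Dict[str, float], top_fraction: float = 0.25) -> Dict[str, str]:
    """Rank users by descending score and label the top fraction as suspicious."""
    ranking = sorted(scores, key=scores.get, reverse=True)
    n_suspicious = max(1, round(len(ranking) * top_fraction))
    return {
        name: ("suspicious" if position < n_suspicious else "genuine")
        for position, name in enumerate(ranking)
    }


scores = spam_scores(users)
labels = label_by_rank(scores)
```

The resulting `labels` dictionary plays the role of the labeled data on which classifiers would then be trained. In this toy setup every feature is assumed to grow with suspiciousness; a feature pointing the other way would need to be inverted before averaging.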

Supervisors: Elena Maria Baralis
Academic year: 2017/18
Publication type: Electronic
Number of Pages: 101
Degree programme: Master's degree in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica)
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/10038