Riccardo Vellano
Covid-19 and the Financial Markets: Analysis of the Correlation Between Tweets Sentiment in the United States and the Return of a Portfolio during the Market Crash.
Supervisor: Tania Cerquitelli. Politecnico di Torino, Master's degree programme in Mathematical Engineering, 2022
PDF (Tesi_di_laurea) - Thesis
Licence: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
This Master's thesis investigates whether market fluctuations during the first months of the COVID-19 pandemic can be predicted in order to enhance the returns of a portfolio of sector ETFs. As studied by many authors and argued by Gu and Kurov (2020), investors rely on social media sentiment and information to make investment decisions. Over the years, sentiment analysis of social media posts for stock market prediction has attracted considerable interest, as shown by the many papers on the topic. Bollen et al. (2011) argue in “Twitter mood predicts the stock market”, a pivotal paper in this field, that stock market fluctuations can be predicted by analysing sentiment on Twitter. To this end, around 1.4 million tweets are downloaded from the social network via an Academic Research Developer account. All tweets were posted from the United States, in English, between 23 February 2020 and 31 May 2020, and they comment on the topic of COVID-19. An exploratory analysis of the dataset shows that the downloaded tweets express feelings, opinions and comments or report news related to the pandemic; they were selected from generic users with the specific goal of obtaining an unbiased dataset. The sentiment of every tweet is extracted with a cutting-edge Natural Language Processing architecture called the Transformer. It can be positive, negative or neutral, depending on the words used in the tweet and the feelings expressed. In parallel, the research focuses on sector Exchange Traded Funds (ETFs), which are funds traded on the markets that aim to replicate the performance of certain securities. Here, the ETFs are used to replicate the performance of specific sectors that were particularly affected by the pandemic: the industrial, financial, healthcare and technology sectors.
The returns of these ETFs represent the returns achieved by an investor seeking financial exposure to the aforementioned sectors. The performance of the ETFs over 30-minute intervals is linked to the sentiment expressed on Twitter during the same interval to create a dataset for training predictive models. Prior research inspired the use of supervised learning models such as k-Nearest Neighbours (kNN) and Support Vector Machines (SVM) to forecast directional fluctuations in ETF prices. The kNN algorithm performs best, with around 58 % accuracy on the training dataset. Using the model's directional forecasts on a test dataset composed of May 2020 tweets and ETF performance, it is shown that these predictions improve the return of a portfolio invested in the Industrial Select Sector SPDR Fund (XLI) ETF during the COVID-19 pandemic from 7.33 % to 15.84 %. An improvement of 8.51 percentage points in the portfolio return is therefore achieved when using insights produced by sentiment analysis of tweets combined with supervised machine learning algorithms. All the algorithms are coded in Python and were developed independently, using pre-compiled Python libraries where indicated, in particular for the Transformer architecture and the SVM and kNN models. In conclusion, this thesis suggests that sentiment analysis-based yield enhancement strategies are possible by taking advantage of the information shared on Twitter by its users, an idea rooted in the so-called Wisdom of Crowds (Surowiecki, 2004). |
---|---|
Supervisors: | Tania Cerquitelli |
Academic year: | 2022/23 |
Publication type: | Electronic |
Number of pages: | 48 |
Subjects: | |
Degree programme: | Master's degree programme in Mathematical Engineering |
Degree class: | New regulations > Master's degree > LM-44 - Mathematical and Physical Modelling for Engineering |
Collaborating companies: | Not specified |
URI: | http://webthesis.biblio.polito.it/id/eprint/24861 |
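The pipeline summarised in the abstract (tweet sentiment aggregated over 30-minute intervals, a kNN directional forecast, and a portfolio that acts on the forecast) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the function names, the choice of sentiment-label shares as features, the from-scratch kNN vote (the thesis states that pre-compiled libraries were used for kNN), and the rule of holding the ETF only when an upward move is predicted.

```python
import math
from collections import Counter
from datetime import datetime, timedelta

LABELS = ("negative", "neutral", "positive")

def bucket_start(ts, minutes=30):
    """Floor a timestamp to the start of its interval (30 minutes by default)."""
    return ts - timedelta(minutes=ts.minute % minutes,
                          seconds=ts.second,
                          microseconds=ts.microsecond)

def sentiment_shares(tweets, minutes=30):
    """Map each interval start to the share of negative/neutral/positive tweets.

    tweets: iterable of (timestamp, label) pairs. The feature choice
    (label shares) is a hypothetical one made for this sketch.
    """
    counts = {}
    for ts, label in tweets:
        counts.setdefault(bucket_start(ts, minutes), Counter())[label] += 1
    shares = {}
    for start, c in counts.items():
        total = sum(c.values())
        shares[start] = tuple(c[name] / total for name in LABELS)
    return shares

def knn_direction(train, query, k=3):
    """Predict +1 (up) or -1 (down) by majority vote of the k nearest points.

    train: list of (features, direction) pairs with direction in {+1, -1}.
    Distance is Euclidean; use an odd k to avoid ties.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = sum(direction for _, direction in nearest)
    return 1 if votes > 0 else -1

def compound(returns):
    """Compound a sequence of simple per-interval returns."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0

def strategy_return(signals, returns):
    """Hold the ETF only in intervals predicted to rise; stay in cash otherwise."""
    return compound(r for s, r in zip(signals, returns) if s > 0)
```

In use, `sentiment_shares` would supply the feature vector for each interval, `knn_direction` would be called on each test interval with the training intervals as neighbours, and `strategy_return` could then be compared against `compound` applied to the unfiltered returns, which corresponds to a buy-and-hold baseline.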