
Integrating news sentiment analysis into quantitative stock trading data

Vincenzo Savarese

Integrating news sentiment analysis into quantitative stock trading data.

Supervisors: Luca Cagliero, Paolo Garza. Politecnico di Torino, Master's degree course in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica), 2019

PDF (Tesi_di_laurea) - Thesis
Document access: Anyone
License: Creative Commons Attribution Non-commercial No Derivatives.


Wouldn’t it be great if we could teach a computer to perceive the underlying sentiment of a text and exploit that information in stock price forecasting? Are there words that have a more significant influence on daily stock price changes? This study addresses these questions in the context of stock price forecasting based on news sentiment analysis, using a quantitative trading system. Thanks to the increasing openness and availability of electronic information in recent years, several studies have investigated the use of financial news text to improve stock price prediction. What makes our work distinctive and challenging, however, is the scale at which the analysis was applied. To the best of our knowledge, this work is one of the largest-scale analyses applying text-mining techniques to stock price forecasting within a quantitative trading framework. A quantitative trading model applies trading strategies that rely on quantitative analysis, which uses both mathematical computations and statistical indicators to identify patterns in stock markets. Sentiment analysis, on the other hand, is a text mining process that tries to extract subjective information from text. A text has a positive sentiment if, for instance, the sentiment analysis process detects more positive than negative words. The heart of this research is the choice of news-based variables derived from the news sentiment analysis phase. These characterising variables help the quantitative model in the stock prediction process: the more descriptive the variables are, the higher the probability of making a profit. This choice is divided into two phases: a priori variable definition and final news-based feature selection. In the first phase, we define variables that we expect to best reflect news sentiment. In the second phase, statistical measures are applied to select the most representative and predictive variables in order to maximise their influence on the forecasting model. To assess the effectiveness of adding news sentiment, we ran experiments with and without the final news-based features. The results with news sentiment significantly outperformed those without it, with an average increase in total profit of nearly 30%.
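
As an illustration of the word-counting idea mentioned in the abstract (a text is positive when more positive than negative words are detected), the following is a minimal Python sketch. It is not the thesis's actual pipeline: the two word lists, the function names, and the normalised polarity score are assumptions introduced here purely for illustration.

```python
import re

# Hypothetical toy lexicons, chosen only for this example;
# the abstract does not state which word lists the thesis uses.
POSITIVE_WORDS = {"gain", "growth", "profit", "beat", "upgrade"}
NEGATIVE_WORDS = {"loss", "decline", "lawsuit", "miss", "downgrade"}


def sentiment_counts(text):
    """Return (positive, negative) word counts for a piece of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive = sum(1 for token in tokens if token in POSITIVE_WORDS)
    negative = sum(1 for token in tokens if token in NEGATIVE_WORDS)
    return positive, negative


def sentiment_label(text):
    """Label a text by comparing positive and negative word counts."""
    positive, negative = sentiment_counts(text)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"


def polarity_score(text):
    """Assumed news-based variable: (pos - neg) / (pos + neg), in [-1, 1]."""
    positive, negative = sentiment_counts(text)
    total = positive + negative
    return 0.0 if total == 0 else (positive - negative) / total
```

A score such as `polarity_score` is one example of a candidate variable that could be defined a priori per news item and later kept or discarded in a feature selection step; which variables the thesis actually defines is not stated in this record.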

Supervisors: Luca Cagliero, Paolo Garza
Academic year: 2018/19
Publication type: Electronic
Number of Pages: 73
Degree course: Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering)
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/11591