Politecnico di Torino

Forecasting Public Transport Demand using Smart Cards Data

Eleonora Gastaldi

Forecasting Public Transport Demand using Smart Cards Data.

Supervisors: Silvia Anna Chiusano, Elena Daraio. Politecnico di Torino, Master's degree programme in ICT for Smart Societies (Ict Per La Società Del Futuro), 2021

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


The collection of mobility data through the validation of electronic tickets and smart cards makes it possible to obtain personal information about users and their mobility patterns. With this knowledge, passenger demand can be forecast, which is fundamental to optimizing the allocation of resources (personnel and vehicles), network planning and frequency setting, and therefore to reducing operating costs. The Granda Bus consortium provides about 10 million smart card validations covering the whole of 2019 in the Piedmont area, in north-west Italy. The study, conducted at the Links Foundation, exploits these data to answer the following research question: "What is the estimated public transport demand at one bus stop for a selected route, given a specific day and time slot?". To answer it, a methodology has been designed that covers the whole KDD (knowledge discovery in databases) process. It opens with a preliminary data analysis, useful to understand the quality and integrity of the data and to identify the best way to process them. The bus stop-route pair is the core element of the analysis: this choice is justified by the fact that a bus stop can serve several routes, each characterized by its own target users and therefore by a validation trend that differs significantly from the others. A clustering process, based on the number of validations and the importance of the offer, has been applied to all bus stop-route pairs to detect a set of representative pairs. The study focuses on these and compares the performance of the selected machine learning techniques. In particular, the predictive models selected for the analysis are: Average and Median Response, Random Forest Regressor, Gradient Boosted Decision Tree, Support Vector Regression and SARIMA.
The results show that a temporal segmentation is needed, since the validation trend changes with the period of the year, in correlation with school opening and closing periods. For each segment and cluster, the best machine learning model has been identified.
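
As an illustration of the simplest models named in the abstract, the sketch below shows one plausible reading of an Average and Median Response baseline: the forecast for a bus stop-route pair in a given weekday and hourly slot is the mean or median of the validation counts observed in the same slot in the past. The record layout, the hourly slot width, the stop and route names, and the function names are all assumptions made for this example; the thesis may define its features and slots differently.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import mean, median


def make_records(stop, route, day, hour, n):
    """Build n hypothetical validation records within one hourly slot."""
    return [(stop, route, f"{day} {hour:02d}:{m:02d}") for m in range(n)]


# Toy data (invented): three consecutive Mondays, 07:00 slot, with 3, 5 and 10 validations.
validations = (
    make_records("stop_A", "route_1", "2019-03-04", 7, 3)
    + make_records("stop_A", "route_1", "2019-03-11", 7, 5)
    + make_records("stop_A", "route_1", "2019-03-18", 7, 10)
)


def count_per_slot(records):
    """Count validations per (stop, route, date, hour)."""
    counts = defaultdict(int)
    for stop, route, ts in records:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M")
        counts[(stop, route, t.date(), t.hour)] += 1
    return counts


def fit_baseline(counts, aggregate=mean):
    """Aggregate historical counts by (stop, route, weekday, hour).

    Slots with no validations at all never appear in `counts`, so this
    simplified version ignores zero-demand slots.
    """
    history = defaultdict(list)
    for (stop, route, day, hour), n in counts.items():
        history[(stop, route, day.weekday(), hour)].append(n)
    return {key: aggregate(values) for key, values in history.items()}


def predict(model, stop, route, day, hour):
    """Return the forecast for a slot, or None if the slot was never observed."""
    return model.get((stop, route, day.weekday(), hour))


counts = count_per_slot(validations)
mean_model = fit_baseline(counts, aggregate=mean)
median_model = fit_baseline(counts, aggregate=median)

# Forecast for the following Monday (2019-03-25), 07:00 slot.
print(predict(mean_model, "stop_A", "route_1", date(2019, 3, 25), 7))
print(predict(median_model, "stop_A", "route_1", date(2019, 3, 25), 7))
```

On this toy data the mean forecast is 6 and the median forecast is 5, which shows how the median is less sensitive to a single unusually busy day. A baseline of this kind gives a reference point against which the more complex models listed in the abstract can be compared.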

Supervisors: Silvia Anna Chiusano, Elena Daraio
Academic year: 2021/22
Publication type: Electronic
Number of Pages: 79
Degree programme: Master's degree programme in ICT for Smart Societies (Ict Per La Società Del Futuro)
Degree class: New organization > Master science > LM-27 - TELECOMMUNICATIONS ENGINEERING
URI: http://webthesis.biblio.polito.it/id/eprint/20414