Ali Mousazadeh
Implementing a neural network solution for predicting estimated time of arrival.
Supervisor: Danilo Giordano. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2025
| Abstract: |
Estimating the time of arrival (ETA) for vehicular travel is central to modern transportation and logistics systems. This thesis presents an end-to-end pipeline for processing raw GPS data, map-matching trips with Valhalla, and training a deep learning model that predicts ETAs accurately from historical information alone. The dataset is drawn from a major metropolitan region (Paris) and comprises over 2 million trips from 2023 for training and validation, plus an additional 400,000 trips from 2024 for final testing. To address data quality issues, robust preprocessing filters were applied to eliminate unstructured or anomalous sessions. Trip boundaries were defined by time gaps and idle periods, and each trip underwent map matching to align latitude/longitude points with OpenStreetMap road segments. Additional constraints (e.g., bounding box, mileage and duration ranges, speed thresholds) narrowed the dataset to a clean, realistic subset. Features include road-level attributes such as way ID, road type, segment length, and speed limit, as well as higher-level contextual factors such as day of week, month, hour, and trip mileage. These were incorporated into a neural architecture that merges an LSTM (to process sequential road segments) with a trip-level embedding layer. The model estimates a travel time for each road segment, and these estimates are summed to produce the final ETA. Because the model was designed in-house, all data remains local, which protects privacy and supports scaling to millions of trips. Model training uses the Adam optimizer with mean absolute error (MAE) as the loss function, together with batch normalization, weight decay, and early stopping to balance convergence speed against overfitting. Although up to 10 million trips were available, experiments showed that 2 million sufficed for strong generalization under these hyperparameters.
Evaluations on 2024 data demonstrated that the proposed approach consistently outperforms a baseline Valhalla routing estimate across a range of trip durations and mileages, with median absolute errors often halved. These findings underline the efficacy of combining road-segment embeddings, time-based contextual data, and sequential modeling to capture vehicle travel patterns without real-time traffic feeds. The framework paves the way for deeper analysis of driver behavior and traffic flows, offering insights into both common and unusual driving conditions. Future work may incorporate partial real-time signals, multi-GPU training, or more advanced architectures to further enhance performance. |
|---|---|
| Supervisors: | Danilo Giordano |
| Academic year: | 2025/26 |
| Publication type: | Electronic |
| Number of pages: | 55 |
| Additional information: | Confidential thesis. Full text not available |
| Subjects: | |
| Degree programme: | Master's degree programme in Data Science And Engineering |
| Degree class: | New system > Master's degree > LM-32 - COMPUTER ENGINEERING |
| Collaborating companies: | STELLANTIS EUROPE SPA |
| URI: | http://webthesis.biblio.polito.it/id/eprint/37874 |
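
As a rough illustration of two ideas mentioned in the abstract, namely summing per-segment travel times into a trip-level ETA and scoring predictions with mean and median absolute error, here is a minimal Python sketch. The function names and all numbers are hypothetical and are not taken from the thesis; in the thesis the per-segment estimates come from an LSTM-based model, which is not reproduced here.

```python
from statistics import mean, median


def trip_eta(segment_times_s):
    """Sum per-segment travel-time estimates (seconds) into one trip ETA."""
    return sum(segment_times_s)


def absolute_errors(predicted_s, actual_s):
    """Element-wise absolute error between predicted and actual durations."""
    if len(predicted_s) != len(actual_s):
        raise ValueError("predicted and actual lists must have the same length")
    return [abs(p - a) for p, a in zip(predicted_s, actual_s)]


def mean_absolute_error(predicted_s, actual_s):
    """MAE, the quantity the abstract names as the training loss."""
    return mean(absolute_errors(predicted_s, actual_s))


def median_absolute_error(predicted_s, actual_s):
    """Median absolute error, the metric quoted in the evaluation."""
    return median(absolute_errors(predicted_s, actual_s))


# Made-up per-segment estimates (seconds) for three trips; illustrative only.
per_segment_predictions = [
    [30.0, 45.0, 60.0],
    [20.0, 25.0],
    [100.0, 80.0, 40.0, 30.0],
]
# Made-up observed trip durations (seconds); illustrative only.
actual_durations = [140.0, 50.0, 240.0]

predicted_etas = [trip_eta(times) for times in per_segment_predictions]
mae = mean_absolute_error(predicted_etas, actual_durations)
med_ae = median_absolute_error(predicted_etas, actual_durations)
print(predicted_etas, mae, med_ae)
```

The same two metric functions could be applied to a baseline's estimates for the same trips to compare the two sets of errors side by side.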



Creative Commons License - Attribution 3.0 Italy