Enhancing Vehicle Crash Detection through Multivariate Time Series Data Augmentation

Natalia Lebedeva

Enhancing Vehicle Crash Detection through Multivariate Time Series Data Augmentation.

Supervisor: Paolo Garza. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2025

Abstract:

Data-driven decision-making systems have a significant impact in the modern world, to the extent of directly contributing to saving human lives. One example of such an application is the Crash Detector project developed at Generali Italia, a machine learning solution designed to identify vehicle collisions from telematics data collected by in-car black boxes. Detecting crashes in real time enables rapid assistance from designated services to those in need. A key challenge for this project is the severe imbalance in the available data: true crash events are extremely rare among the recorded samples. This scarcity of positive cases makes it difficult for the model to learn meaningful patterns. As a result, predictions can be biased towards non-crash events, while the cost of an error could be the life of a driver or passenger. To address the class imbalance, the present work explores time series data augmentation with the aim of improving the model's performance and stability. The study includes a systematic review of data augmentation strategies, from classical signal-based transformations to advanced generative approaches such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and diffusion models. It also examines methods for assessing synthetic data. Selected techniques are implemented within the Crash Detector project to enrich the telematics data used for model training. The quality of the generated data is evaluated in terms of fidelity, diversity, and representativeness. In addition, comparative experiments are carried out with the existing deep neural network to quantify improvements in detection precision and recall. The results aim to demonstrate that a carefully designed augmentation and evaluation framework can effectively mitigate data imbalance. The outcomes of this study may support the development of more reliable models for telematics applications.
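
As an illustration of the "classical signal-based transformations" mentioned in the abstract, the following is a minimal sketch of two common ones, jittering (additive Gaussian noise) and magnitude scaling, used to oversample the rare positive class. It is not taken from the thesis, whose full text is not available. The function names, the window layout (a list of time steps, each a list of channel values), the noise levels, and the `target_ratio` parameter are all assumptions made for the example.

```python
import random

def jitter(window, sigma=0.05, rng=None):
    """Add independent Gaussian noise to every value of every channel."""
    rng = rng or random.Random()
    return [[x + rng.gauss(0.0, sigma) for x in step] for step in window]

def scale(window, sigma=0.1, rng=None):
    """Multiply each channel by one random factor drawn around 1.0."""
    rng = rng or random.Random()
    factors = [rng.gauss(1.0, sigma) for _ in range(len(window[0]))]
    return [[x * f for x, f in zip(step, factors)] for step in window]

def oversample_minority(windows, labels, target_ratio=0.5, seed=0):
    """Append augmented copies of positive (label 1) windows until the
    positive/negative ratio reaches target_ratio (hypothetical parameter)."""
    rng = random.Random(seed)
    positives = [w for w, y in zip(windows, labels) if y == 1]
    n_neg = sum(1 for y in labels if y == 0)
    out_w, out_y = list(windows), list(labels)
    n_pos = len(positives)
    if not positives:
        return out_w, out_y  # nothing to augment from
    while n_pos < target_ratio * n_neg:
        base = rng.choice(positives)
        out_w.append(scale(jitter(base, rng=rng), rng=rng))
        out_y.append(1)
        n_pos += 1
    return out_w, out_y

# Toy data: 1 crash window and 10 non-crash windows, 4 time steps x 3 channels.
crash = [[1.0, 2.0, 3.0] for _ in range(4)]
normal = [[0.0, 0.1, 0.2] for _ in range(4)]
windows = [crash] + [normal] * 10
labels = [1] + [0] * 10
aug_windows, aug_labels = oversample_minority(windows, labels, target_ratio=0.5)
```

Transformations of this kind only perturb existing crash windows. Whether the perturbed signals remain physically plausible is what fidelity and diversity checks on synthetic data are meant to establish.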

Supervisors: Paolo Garza
Academic year: 2025/26
Publication type: Electronic
Number of pages: 75
Additional information: Confidential thesis. Full text not available
Subjects:
Degree programme: Master's degree programme in Data Science And Engineering
Degree class: New organization > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies: Generali Italia S.p.A.
URI: http://webthesis.biblio.polito.it/id/eprint/38755