
An autoencoder-based clustering strategy for usage pattern detection on heavy duty’s vehicles’ CAN bus data

Ruggiero Francavilla

An autoencoder-based clustering strategy for usage pattern detection on heavy duty’s vehicles’ CAN bus data.

Supervisors: Francesco Vaccarino, Luca Cagliero, Silvia Buccafusco. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2021

License: Creative Commons Attribution Non-commercial No Derivatives.


This thesis addresses a real-world problem concerning heavy-duty vehicles and their usage patterns. With the spread of IoT devices and the rise of connected mobility in companies, in-vehicle connectivity for heavy-duty vehicles is becoming increasingly important for pattern identification. The work was carried out at Tierra S.p.A., a company that develops innovative solutions in advanced telematics and IoT, as part of the collaboration between the applied research and data analytics department and Politecnico di Torino. Its purpose is to identify the thresholds between usage patterns in heavy-duty vehicles, which clients fail to detect manually. To this end, a multivariate time series analysis based on an innovative autoencoder technique is presented. After an initial exploration of the signals and the selection of those most relevant to the task, series with different sampling rates are combined. Then, for data preparation, a segmentation strategy for the multivariate time series based on the VALMOD algorithm is proposed. Autoencoder models and clustering techniques are then combined for pattern identification: the approach looks for patterns in the data by exploiting the ability of autoencoder models to reconstruct signals. Different usage patterns are identified by applying a clustering technique to a customized dissimilarity matrix based on the autoencoders' reconstruction error. A score-based strategy then helps identify the thresholds between usage patterns. Finally, after a manual validation carried out with domain experts and visualization tools, a silhouette-based approach is described that compares the clustering results obtained with different distance measures, analyzing both the separability and cohesion of the groups and the usage patterns detected.
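
The idea of clustering on a reconstruction-error dissimilarity matrix can be illustrated with a minimal sketch. It is not the thesis implementation and rests on several assumptions: a linear autoencoder (PCA computed via SVD) stands in for the autoencoder models, the dissimilarity between two segments is taken to be the symmetrized error obtained when each segment is reconstructed by the model fitted on the other, a plain average-linkage procedure stands in for the unspecified clustering technique, and the data are synthetic. All function names are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(segment, n_latent=1):
    """Assumed stand-in for an autoencoder: PCA fitted on one segment."""
    mean = segment.mean(axis=0)
    _, _, vt = np.linalg.svd(segment - mean, full_matrices=False)
    return mean, vt[:n_latent]  # encoder/decoder share the same components

def reconstruction_error(model, segment):
    """Mean squared error after encoding and decoding `segment` with `model`."""
    mean, components = model
    centered = segment - mean
    reconstructed = centered @ components.T @ components
    return float(np.mean((centered - reconstructed) ** 2))

def dissimilarity_matrix(segments, n_latent=1):
    """Assumed definition: symmetrized cross-reconstruction error."""
    models = [fit_linear_autoencoder(s, n_latent) for s in segments]
    n = len(segments)
    cross = np.array([[reconstruction_error(models[i], segments[j])
                       for j in range(n)] for i in range(n)])
    d = 0.5 * (cross + cross.T)
    np.fill_diagonal(d, 0.0)
    return d

def average_linkage(d, n_clusters):
    """Agglomerative clustering on a precomputed dissimilarity matrix."""
    clusters = [[i] for i in range(len(d))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dist = d[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters.pop(b)
    labels = np.empty(len(d), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

def silhouette(d, labels):
    """Mean silhouette coefficient from a precomputed dissimilarity matrix."""
    scores = []
    for i in range(len(d)):
        same = labels == labels[i]
        same[i] = False
        if not same.any():          # singleton cluster: score 0 by convention
            scores.append(0.0)
            continue
        a = d[i, same].mean()       # cohesion
        b = min(d[i, labels == k].mean()
                for k in np.unique(labels) if k != labels[i])  # separation
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Synthetic data: two usage patterns, four 3-signal segments each.
rng = np.random.default_rng(0)

def make_segment(offset, direction, n=200):
    t = rng.normal(size=(n, 1))
    return offset + t * direction + 0.05 * rng.normal(size=(n, 3))

segments = (
    [make_segment(np.zeros(3), np.array([1.0, 0.0, 0.0])) for _ in range(4)]
    + [make_segment(np.array([0.0, 5.0, 5.0]), np.array([0.0, 1.0, 0.0])) for _ in range(4)]
)

d = dissimilarity_matrix(segments)
labels = average_linkage(d, n_clusters=2)
score = silhouette(d, labels)
print(labels, round(score, 3))
```

Segments generated from the same pattern are reconstructed well by each other's models and end up close together, while segments from different patterns produce large cross-reconstruction errors; the silhouette value then summarizes how cohesive and well separated the resulting groups are.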

Supervisors: Francesco Vaccarino, Luca Cagliero, Silvia Buccafusco
Academic year: 2021/22
Publication type: Electronic
Number of Pages: 78
Degree programme: Master's degree programme in Data Science And Engineering
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: Tierra spa
URI: http://webthesis.biblio.polito.it/id/eprint/20476