Outlier detection to detect segment transitions between time series data

Ilaria Zerbini

Supervisor: Luca Cagliero. Politecnico di Torino, Master's degree in Mathematical Engineering (Ingegneria Matematica), 2023

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Time series data, prevalent across scientific studies, capture the temporal evolution of phenomena such as sports activities, weather patterns, and health conditions. The fundamental task of classifying these data hinges on identifying statistical properties that can be used to predict labels. This becomes particularly challenging with long, multi-dimensional time series, which makes a preliminary step necessary: segmenting the series into sub-series. This research focuses on recognizing transitions between segments characterized by homogeneous trends, a critical aspect of segmentation. Transition points are often ambiguous, situated at the juncture of two states, and identifying them poses unique segmentation challenges. In the following pages, we will address this task by exploring supervised and semi-supervised outlier detection techniques. Our investigation will begin by establishing the performance of supervised methods, introducing a new tree ensemble method coupled with a novel sampling strategy. Then, for transition recognition, we will leverage Time2State, an unsupervised state-of-the-art algorithm designed for segmentation. This will be extended by integrating an unsupervised outlier detection step and a separation score-based sorting criterion, addressing real-world semi-supervised scenarios. Additionally, we will explore data transformation using Shapelets to represent time series data. Thorough testing on ActRecTut and synthetic datasets demonstrates that Time2State alone struggles with precise transition recognition. However, the supervised pipeline performs solidly, and the semi-supervised pipelines outperform Time2State alone, providing valuable domain insights.
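
The idea of treating transitions as outliers can be illustrated with a minimal sketch. This is not the thesis pipeline: it uses neither Time2State, nor the tree ensemble, nor Shapelets. It assumes a much simpler stand-in, in which each sliding window is summarized by the absolute difference between the means of its two halves, and windows are flagged with the conventional median/MAD modified z-score (constant 0.6745, threshold 3.5). The function names, the window width, and the toy series are all hypothetical choices made for illustration.

```python
import math
from statistics import median


def half_window_shift(series, width):
    """For each sliding window, return |mean(right half) - mean(left half)|."""
    half = width // 2
    shifts = []
    for start in range(len(series) - width + 1):
        left = series[start:start + half]
        right = series[start + half:start + width]
        shifts.append(abs(sum(right) / len(right) - sum(left) / len(left)))
    return shifts


def modified_z_outliers(values, threshold=3.5):
    """Return indices whose median/MAD-based modified z-score exceeds threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to scale by, so nothing is flagged
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


# Toy series (assumption): two flat states with a small deterministic wiggle,
# switching from level 0 to level 5 at index 40.
series = [(0.0 if i < 40 else 5.0) + 0.1 * math.sin(i) for i in range(80)]

features = half_window_shift(series, width=6)
flagged = modified_z_outliers(features)
print(flagged)  # start indices of windows flagged as transition candidates
```

On this toy input the flagged windows are those starting at indices 35 to 39, which are the only windows spanning both states. Real activity data would have gradual, noisy transitions, where a single hand-picked feature like this one is unlikely to be sufficient.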

Supervisors: Luca Cagliero
Academic year: 2023/24
Publication type: Electronic
Number of Pages: 82
Degree programme: Master's degree in Mathematical Engineering (Ingegneria Matematica)
Degree class: New organization > Master science > LM-44 - MATHEMATICAL MODELLING FOR ENGINEERING
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/29061