
Matrix Profile meets Contrastive Learning: A Novel approach to Time series Anomaly Detection

Alessandro Gelsi

Matrix Profile meets Contrastive Learning: A Novel approach to Time series Anomaly Detection.

Supervisors: Luca Cagliero, Jacopo Fior. Politecnico di Torino, Not specified, 2024

PDF (Tesi_di_laurea) - Thesis
Access restricted to: staff users only until 11 April 2025 (embargo date).
License: Creative Commons Attribution Non-commercial No Derivatives.

Download (10MB)
Abstract:

This thesis proposes a novel approach to time series anomaly detection. Time series are sequences of data points indexed in time order, and they play an important role in industries such as finance, healthcare, and energy production. Recent technological developments have produced huge amounts of data in many fields, making manual analysis infeasible. Traditional statistical methods often assume that the data come from a specific mathematical model, and they struggle to scale to large datasets. In contrast, machine learning approaches treat data generation as a black box, relying on algorithms to identify patterns without explicitly modeling the generation process. While this offers scalability and flexibility, it often requires large labeled datasets and expert knowledge, which makes it challenging and resource-intensive. Anomaly detection is the identification of data entries that stand out as unusual within a dataset. Although the specific definition of an anomaly varies with the task and the domain, it generally refers to instances that deviate significantly from the rest. The proposed approach integrates the Matrix Profile concept with a contrastive representation learning technique. The study introduces DAMP (Discord Aware Matrix Profile), a deterministic method that outputs a data structure called the matrix profile (MP). The MP contains similarity values between subsequences and can be used for traditional anomaly detection. This work also explores contrastive representation learning for time series, focusing specifically on the TNC model. The contrastive approach is based on learning by comparison: models are trained to distinguish between positive (similar) and negative (dissimilar) pairs of data sequences. In this way, it is possible to learn meaningful representations of the data that capture its inherent temporal dynamics.
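
To make the matrix profile idea concrete, here is a minimal brute-force sketch. It is not DAMP, which is considerably more efficient. It assumes z-normalized Euclidean distance between subsequences and an exclusion zone of one window length; the function names and the toy series are hypothetical and chosen only for illustration. Each profile entry is the distance from a subsequence to its nearest non-overlapping neighbor, so the largest entry marks the most unusual subsequence (the top discord).

```python
import math

def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in window]

def naive_matrix_profile(series, m):
    """Brute-force matrix profile (illustrative, O(k^2 * m)).

    For each length-m subsequence, return the z-normalized Euclidean
    distance to its nearest non-overlapping neighbor.
    """
    k = len(series) - m + 1
    windows = [znorm(series[i:i + m]) for i in range(k)]
    profile = []
    for i in range(k):
        best = math.inf
        for j in range(k):
            if abs(i - j) < m:  # exclusion zone: skip overlapping (trivial) matches
                continue
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(windows[i], windows[j])))
            best = min(best, d)
        profile.append(best)
    return profile

def top_discord(profile):
    """Index of the subsequence with the largest finite profile value."""
    finite = [(v, i) for i, v in enumerate(profile) if math.isfinite(v)]
    return max(finite)[1]

# Toy example: a periodic series with one injected spike at position 21.
series = [0, 1, 2, 1] * 10
series[21] = 5
profile = naive_matrix_profile(series, m=4)
print(top_discord(profile))
```

On this toy series every clean window has an exact repeat elsewhere, so its profile value is zero, while the windows covering position 21 have no close match and one of them is reported as the discord.
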
The key innovation is the formulation of three families of novel loss functions that incorporate MP information into the training process of the contrastive model. The first, referred to as Hybrid Discord-penalty loss, adds to the TNC loss a contribution that depends directly on the MP. The second, referred to as Contrastive Discord-aware loss, uses the MP to influence how positive and negative pairs are defined in the contrastive approach. The third, referred to as Discord-discriminating loss, treats the MP as an additional feature of the time series data and processes it in parallel on a separate branch. Extensive experiments are first conducted on DAMP alone, followed by tests of TNC and of the proposed integrations. Experimental results on three datasets demonstrate the effectiveness of these contributions. Hybrid Discord-penalty loss yields variable but positive performance improvements. Contrastive Discord-aware loss shows context-sensitive efficacy. Discord-discriminating loss stands out for its positive impact, marking it as a highly effective approach. Altogether, the thesis argues that combining Matrix Profile and contrastive representation learning can significantly enhance anomaly detection, providing more accurate and robust results. The findings have broad implications for sectors that rely on time series data, paving the way for integration into future, more powerful models.
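
The abstract does not state how the MP enters any of the three loss families. As a purely hypothetical illustration of a contrastive loss whose value depends directly on MP values, and not the formulation used in the thesis, the sketch below weights a standard binary cross-entropy contrastive term by the min-max normalized MP value of each anchor window. The function names, the weighting scheme, and the parameter `lam` are all assumptions introduced here.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def min_max(values):
    """Rescale values to [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def mp_weighted_contrastive_loss(pos_logits, neg_logits, mp_values, lam=1.0):
    """Hypothetical MP-weighted contrastive loss (not the thesis formulation).

    pos_logits[i], neg_logits[i]: discriminator scores for the positive and
    negative pair of anchor window i. mp_values[i]: MP value of that window.
    Each anchor's binary cross-entropy term is weighted by
    1 + lam * normalized MP, so lam = 0 gives the plain, unweighted mean.
    """
    weights = [1.0 + lam * w for w in min_max(mp_values)]
    total = 0.0
    for p, n, w in zip(pos_logits, neg_logits, weights):
        # BCE with logits: positive pairs target 1, negative pairs target 0.
        total += w * (softplus(-p) + softplus(n))
    return total / sum(weights)

# Toy batch: the second anchor is misclassified and has the higher MP value.
plain = mp_weighted_contrastive_loss([2.0, -1.0], [-2.0, 1.0], [0.0, 1.0], lam=0.0)
weighted = mp_weighted_contrastive_loss([2.0, -1.0], [-2.0, 1.0], [0.0, 1.0], lam=1.0)
print(plain, weighted)
```

Under this assumed scheme, anchors with high MP values (discord-like windows) contribute more to the loss, so the weighted loss in the toy batch is larger than the plain one.
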

Supervisors: Luca Cagliero, Jacopo Fior
Academic year: 2023/24
Publication type: Electronic
Number of pages: 154
Subjects:
Degree programme: Not specified
Degree class: New regulations > Master's degree > LM-32 - Computer Engineering (Ingegneria Informatica)
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/31087