Unsupervised anomaly detection on multivariate timeseries in an Oracle database

Davide Di Mauro

Unsupervised anomaly detection on multivariate timeseries in an Oracle database.

Supervisor: Daniele Apiletti. Politecnico di Torino, Master's degree course in Data Science And Engineering, 2023

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

This thesis presents an unsupervised anomaly detection method for multivariate time series in an Oracle database, a relational database management system widely used across many domains. Anomaly detection is a machine learning task that aims to identify patterns in data that deviate from expected behavior; its applications include database performance monitoring, fraud detection, and intrusion detection. However, most existing methods require labeled data or manual selection of training data, neither of which is feasible in a truly unsupervised scenario. The proposed method is based on adversarially trained autoencoders: neural networks that learn to compress and reconstruct the input data and that detect anomalies by measuring the reconstruction error. The proposed application consists of two phases: a training phase, in which the autoencoders are trained on normal data selected automatically from the database metrics, and an inference phase, in which the model is exposed to new data and anomalies are identified using dynamic thresholds. The method is tested on data collected from different Oracle database instances, with a varying number of database statistics. The results show that the proposed method effectively detects multivariate anomalies and, because it detects them in near real time, can be applied in a production environment. The main contributions of this thesis are: (1) a novel unsupervised anomaly detection framework for multivariate time series in an Oracle database; (2) a general and scalable architecture for implementing the method; (3) a comprehensive evaluation of the method on real-world data.
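
The inference phase described in the abstract (scoring each time step by reconstruction error, then comparing it with a dynamic threshold) can be illustrated with a minimal sketch. The abstract does not state which thresholding rule the thesis uses, so the rule below (rolling mean plus `k` standard deviations over recent non-anomalous errors) is an assumption chosen for illustration. The function names, `window`, and `k` are likewise hypothetical, and the autoencoder itself is omitted: its reconstructions are taken as given input.

```python
from statistics import mean, pstdev


def reconstruction_error(observed, reconstructed):
    """Mean squared error across the metrics of a single time step."""
    return mean((o - r) ** 2 for o, r in zip(observed, reconstructed))


def flag_anomalies(errors, window=5, k=3.0):
    """Flag each error that exceeds a dynamic threshold.

    Assumed rule (not taken from the thesis): the threshold is
    mean + k * std of the last `window` errors that were NOT flagged,
    so a burst of anomalies does not inflate the baseline.
    The first `window` points are treated as warm-up and never flagged.
    """
    flags = []
    history = []  # errors considered normal so far
    for err in errors:
        if len(history) < window:
            flags.append(False)
            history.append(err)
            continue
        recent = history[-window:]
        threshold = mean(recent) + k * pstdev(recent)
        is_anomaly = err > threshold
        flags.append(is_anomaly)
        if not is_anomaly:
            history.append(err)
    return flags


# Hypothetical per-time-step reconstruction errors with one spike.
errors = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.95, 0.10]
flags = flag_anomalies(errors, window=5, k=3.0)
```

Because only non-flagged errors enter the history, the threshold adapts to slow drift in the normal error level without being pulled upward by the anomalies it has just detected; whether the thesis makes the same design choice is not stated in the abstract.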

Supervisors: Daniele Apiletti
Academic year: 2022/23
Publication type: Electronic
Number of pages: 90
Subjects:
Degree course: Master's degree course in Data Science And Engineering
Degree class: New regulations > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies: Mediamente Consulting srl
URI: http://webthesis.biblio.polito.it/id/eprint/27687