Luca Stradiotti
Semi-supervised Tree-based Anomaly Detection.
Supervisor: Paolo Garza. Politecnico di Torino, Master's degree programme in Ingegneria Informatica (Computer Engineering), 2022
PDF (Tesi_di_laurea)
- Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.
Abstract: |
In many real-world applications, abnormal behaviors must be detected immediately to avoid dangerous situations. Several automated approaches have been proposed to analyze the collected data and identify critical and dangerous patterns. This task has traditionally been treated as an unsupervised learning problem because no labeled instances were available: obtaining training labels is extremely expensive and time-consuming, since experts have to carefully inspect the data and provide the labels. Nowadays, however, a few labels are often available, so many semi-supervised models have been studied, and they significantly improve on unsupervised performance. Semi-supervised models are divided into three categories depending on their approach. One category consists of tree-based models, which learn to classify anomalies and normal data by building an ensemble of trees. Although these models are very powerful, they are understudied in the literature because of the difficulty of using both unlabeled and labeled information during the tree-construction phase. Therefore, a novel semi-supervised tree-based approach is proposed in this work. The model learns from both the available labeled instances and the unlabeled data to partition the space into regions that distinguish normal samples from outliers. The model is then evaluated on several benchmark datasets, and its performance is compared with available state-of-the-art algorithms. Empirically, the results show that the proposed approach outperforms the unsupervised and semi-supervised baselines on most of the datasets used. |
---|---|
Supervisors: | Paolo Garza |
Academic year: | 2022/23 |
Publication type: | Electronic |
Number of Pages: | 55 |
Subjects: | |
Degree programme: | Master's degree programme in Ingegneria Informatica (Computer Engineering) |
Degree class: | New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING |
Joint supervision institution: | KUL - KATHOLIEKE UNIVERSITEIT LEUVEN (BELGIUM) |
Collaborating companies: | KU Leuven |
URI: | http://webthesis.biblio.polito.it/id/eprint/24688 |