Unsupervised Outlier Detection from Financial Transaction Data

Patrizio De Girolamo

Unsupervised Outlier Detection from Financial Transaction Data.

Supervisors: Luca Cagliero, Marco Mellia, Flavio Giobergia. Politecnico di Torino, Master's degree course in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica), 2023

PDF (Tesi_di_laurea) - Thesis
Restricted to: Repository staff only until 27 October 2026 (embargo date).
License: Creative Commons Attribution Non-commercial No Derivatives.

In an era where Internet usage is increasingly widespread and banking transactions are becoming simpler and more immediate, financial institutions have a growing need for robust cybersecurity strategies that protect both themselves and their users. Among these strategies, "Transaction Monitoring" plays a crucial role in the modern financial landscape: it involves analyzing large volumes of banking data to detect anomalous behaviors, thereby preventing potential criminal activities. Transaction monitoring can be modeled as a specific instance of outlier detection, where outliers represent suspicious transactions that need to be identified. This task is challenging on several fronts. First, the dataset is imbalanced: normal transactions clearly predominate over anomalous ones. Second, given our limited experience in the financial field, identifying what constitutes anomalous behavior is itself an issue. Two main approaches to automated outlier detection can be found in the literature: supervised and unsupervised. Supervised approaches, although more effective, demand a large amount of human-labeled data, which is difficult and expensive to obtain. These annotations guide the algorithms in learning what is anomalous and what is not. Unsupervised approaches, on the other hand, require less human intervention but tend to achieve lower performance because they lack human supervision. Nevertheless, they are the only option when labeled data are unavailable. The absence of labeled data both limits the use of supervised approaches and prevents the standard evaluation of unsupervised methods. Finally, both approaches require significant hardware and time for training.
This thesis focuses on unsupervised approaches, specifically on implementing, comparing and evaluating some of the methods used in the literature, based on the available resources. Given the above-mentioned challenges, these unsupervised methods are evaluated by injecting synthetic data. The primary objective of this thesis is to identify the transactions and users that the system flags as anomalous, thereby contributing to financial security. However, as part of a larger project, the final objective includes not only outlier detection but also the generation of information that helps explain why these data points have been classified as outliers.
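
The idea of evaluating an unsupervised detector by injecting synthetic data can be illustrated with a minimal sketch. It is not the thesis's method: the log-normal "normal" amounts, the injected amount range, the median/MAD scoring rule, and the names `robust_scores` and `evaluate_with_injection` are all assumptions chosen for illustration. Because the injected points are known, they can serve as stand-in labels for computing precision and recall.

```python
import random
import statistics


def robust_scores(values):
    """Score each value by its distance from the median, in MAD units."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    mad = mad or 1e-9  # avoid division by zero on constant data
    return [abs(v - med) / mad for v in values]


def evaluate_with_injection(n_normal=1000, n_injected=20, seed=0):
    """Inject known synthetic anomalies and measure how many are recovered."""
    rng = random.Random(seed)
    # Assumption: "normal" transaction amounts follow a log-normal distribution.
    normal = [rng.lognormvariate(3.0, 0.5) for _ in range(n_normal)]
    # Assumption: injected anomalies are amounts far above the normal range.
    injected = [rng.uniform(500.0, 1000.0) for _ in range(n_injected)]
    amounts = normal + injected
    # Labels exist only because we injected the anomalies ourselves.
    labels = [0] * n_normal + [1] * n_injected

    scores = robust_scores(amounts)
    # Flag the k highest-scoring transactions, with k = number of injections.
    ranked = sorted(range(len(amounts)), key=lambda i: scores[i], reverse=True)
    flagged = ranked[:n_injected]
    hits = sum(labels[i] for i in flagged)
    precision = hits / len(flagged)
    recall = hits / n_injected
    return precision, recall


if __name__ == "__main__":
    precision, recall = evaluate_with_injection()
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

This toy case is deliberately easy, since the injected amounts sit well outside the normal range. A realistic evaluation would inject anomalies that resemble plausible suspicious behavior, and the result would then depend heavily on how those anomalies are designed.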

Supervisors: Luca Cagliero, Marco Mellia, Flavio Giobergia
Academic year: 2023/24
Publication type: Electronic
Number of Pages: 76
Degree course: Master's degree course in Computer Engineering (Corso di laurea magistrale in Ingegneria Informatica)
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/28660