
Word Embedding applications for Anomaly Detection in financial data

Stefano Chiartano

Word Embedding applications for Anomaly Detection in financial data.

Supervisors: Flavio Giobergia, Elena Maria Baralis, Danilo Giordano. Politecnico di Torino, Master's degree programme in Data Science And Engineering, 2024

Abstract:

Money laundering, the process of disguising the origins of illegally obtained funds, poses a significant threat to global financial systems. Anti-Money Laundering (AML) regulations aim to prevent, detect, and report financial crimes and money laundering activities. These measures help prevent the financing of terrorism, drug trafficking, and other criminal organizations. The scale of the problem is substantial: the United Nations Office on Drugs and Crime (UNODC) estimates that 2-5% of global GDP (Gross Domestic Product) is laundered each year. This makes it essential for banks and financial institutions to actively combat money laundering. However, traditional AML methodologies, which consist mainly of rule-based systems, cannot keep pace with continuously evolving laundering techniques. In recent years, Machine Learning and Artificial Intelligence have been applied in this field, improving the detection of fraudulent transactions and accounts. In this thesis, we propose an approach to combat money laundering that applies Natural Language Processing (NLP) techniques to transaction data, treating each financial transaction as a word and the sequence of transactions in an account's history as a sentence. Using NLP models such as Word2Vec and Doc2Vec, we developed both transaction and account embeddings. We then applied different anomaly detection techniques to these embeddings to identify potential money laundering cases, investigating how effective transaction and account embeddings are for anomaly detection in AML. By evaluating our approach on a real-world dataset, we demonstrate its potential in detecting fraudulent cases, providing a new framework for AML systems.
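The "transactions as words, account histories as sentences" idea in the abstract can be illustrated with a minimal sketch. The thesis full text is not available, so everything concrete here is an assumption made for illustration: the `Transaction` fields, the amount buckets, and the token format are hypothetical and are not the author's actual encoding.

```python
from collections import defaultdict
from typing import NamedTuple


class Transaction(NamedTuple):
    # Hypothetical schema: the abstract does not list the fields used.
    account_id: str
    timestamp: int   # any sortable time value
    amount: float
    kind: str        # e.g. "wire", "cash", "card"


def amount_bucket(amount: float) -> str:
    """Map an amount to a coarse label (assumed bucketing scheme)."""
    if amount < 100:
        return "lt100"
    if amount < 10_000:
        return "lt10k"
    return "ge10k"


def to_word(tx: Transaction) -> str:
    """Encode one transaction as a single discrete token (a 'word')."""
    return f"{tx.kind}_{amount_bucket(tx.amount)}"


def to_sentences(transactions):
    """Group transactions by account, ordered in time.

    Returns {account_id: [word, word, ...]}: one 'sentence' per account.
    """
    by_account = defaultdict(list)
    for tx in sorted(transactions, key=lambda t: t.timestamp):
        by_account[tx.account_id].append(to_word(tx))
    return dict(by_account)


# Toy data, invented for this example only.
transactions = [
    Transaction("A", 3, 50.0, "card"),
    Transaction("A", 1, 12_000.0, "wire"),
    Transaction("B", 2, 900.0, "cash"),
    Transaction("A", 2, 75.0, "card"),
]
sentences = to_sentences(transactions)
print(sentences)
```

Token lists of this shape are what word-embedding libraries such as gensim accept as training input for Word2Vec. For Doc2Vec, each list would additionally be tagged with its account ID so that one vector is learned per account. The resulting vectors could then be passed to an anomaly detector of choice.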

Supervisors: Flavio Giobergia, Elena Maria Baralis, Danilo Giordano
Academic year: 2024/25
Publication type: Electronic
Number of pages: 73
Additional information: Confidential thesis. Full text not available
Subjects:
Degree programme: Master's degree programme in Data Science And Engineering
Degree class: New system > Master's degree > LM-32 - COMPUTER ENGINEERING
Collaborating companies: NOT SPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/33030