Log Anomaly Detection with Graph-Text Contrastive Learning

Xiao Tan

Log Anomaly Detection with Graph-Text Contrastive Learning.

Supervisors: Piero Boccardo, Francesca Matrone. Politecnico di Torino, Master's degree programme in Digital Skills For Sustainable Societal Transitions, 2025

Abstract:	Log analysis is a critical technique for diagnosing issues in large-scale modern computing systems. Over the past decades, numerous deep learning-based log analysis approaches have been developed to detect system anomalies reflected in log data. Anomalies in logs generally fall into two categories: event-level semantic anomalies and structural anomalies. Event-level semantic anomalies occur within the textual content of individual log events, while structural anomalies arise from violations of quantitative relational patterns or sequential dependencies in log event sequences. Existing log anomaly detection methods can be broadly categorized into Large Language Model (LLM)-based and Graph Neural Network (GNN)-based approaches. GNNs excel at detecting structural anomalies by capturing spatial structural relationships, while LLMs are particularly effective at identifying event-level semantic anomalies thanks to their strong contextual understanding. However, each family of methods struggles to exploit the complementary strengths of the spatial structural information and the deep semantic features inherent in log events, leaving part of the data underutilized. To overcome this limitation, this study introduces a novel Graph-Text Contrastive Learning (GTCL) log anomaly detection method. GTCL uses a GNN-based graph encoder to capture structural relationships and an LLM-based text encoder to extract deep semantic features. In addition, to address the heterogeneity between modalities, we introduce a Distance Loss regularization term, extending the model to GTCL-DL, which aims to increase the distance between the shared and modality-specific components of the embeddings. The proposed methods are evaluated on two widely used public log datasets, with experimental results demonstrating that they outperform baseline GNN-based and LLM-based approaches. Moreover, they exhibit stable performance across different window size settings, highlighting their effectiveness and robustness in log-based anomaly detection.
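The abstract does not specify the loss used to align the graph and text encoders, and the full text is not available. As a rough illustration only, the sketch below shows a generic symmetric contrastive (InfoNCE-style) objective between paired graph and text embeddings. It is an assumption about the general technique, not the thesis's actual GTCL formulation, and it does not cover the Distance Loss term. The function names, the temperature value, and the toy embeddings are all hypothetical.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (assumes v is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row maximum for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[i]
    return total / len(logits)

def graph_text_contrastive_loss(graph_emb, text_emb, temperature=0.1):
    """Symmetric InfoNCE-style loss over paired embeddings.

    graph_emb[i] and text_emb[i] are assumed to describe the same log
    window; every other pairing in the batch is treated as a negative.
    """
    g = [l2_normalize(v) for v in graph_emb]
    t = [l2_normalize(v) for v in text_emb]
    # Cosine similarity matrix, scaled by the temperature.
    sim = [[sum(a * b for a, b in zip(gi, tj)) / temperature for tj in t]
           for gi in g]
    sim_t = [list(col) for col in zip(*sim)]  # text-to-graph direction
    return 0.5 * (cross_entropy_diag(sim) + cross_entropy_diag(sim_t))

# Toy batch of two log windows with hypothetical 2-d embeddings.
graph_batch = [[1.0, 0.0], [0.0, 1.0]]
aligned = graph_text_contrastive_loss(graph_batch, [[0.9, 0.1], [0.1, 0.9]])
shuffled = graph_text_contrastive_loss(graph_batch, [[0.1, 0.9], [0.9, 0.1]])
print(aligned < shuffled)
```

In this toy batch the loss is lower when each graph embedding sits closest to its own text embedding than when the pairs are swapped, which is the behaviour a contrastive alignment objective is meant to reward.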
Supervisors:	Piero Boccardo, Francesca Matrone
Academic year:	2024/25
Publication type:	Electronic
Number of pages:	68
Additional information:	Thesis under confidentiality restriction. Full text not available
Subjects:
Degree programme:	Master's degree programme in Digital Skills For Sustainable Societal Transitions
Degree class:	New system > Master's degree > LM-91 - TECHNIQUES AND METHODS FOR THE INFORMATION SOCIETY
Joint supervision institution:	Zepeng Zhang (SWITZERLAND)
Collaborating companies:	Politecnico di Torino
URI:	http://webthesis.biblio.polito.it/id/eprint/34440
