
Unsupervised Machine Learning for Mining Alarm Logs of a Large Telecommunication Network

Golnazsadat Zargarian

Unsupervised Machine Learning for Mining Alarm Logs of a Large Telecommunication Network.

Supervisor: Marco Mellia. Politecnico di Torino, Master's degree programme in Communications And Computer Networks Engineering (Ingegneria Telematica E Delle Comunicazioni), 2018

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Alarm logs play a crucial role in networking as a valuable source of information for anomaly detection and for reporting network failures. Manual analysis of such logs is time-consuming and costly because they contain an extensive amount of data. On the other hand, automatically detecting useful information can also be quite challenging. As a result, finding suitable methods to process these logs is a well-established problem in network analysis. In this research, we propose unsupervised machine learning techniques to mine alarm logs and thus provide meaningful information about possible causes and cascade effects in a network failure. The available data is extracted from the TIM Network Operations Center (NOC) and includes the complete list of alarms raised during specific months in different provinces of Italy. Since most of the features of interest in the dataset are categorical, it is difficult to define a distance measure for clustering algorithms, so we choose association rule mining on frequent items as an alternative approach. To use such methods, we recall the preliminaries of the market basket analysis problem, in which the main objective is to extract actionable knowledge and co-occurrences from the many features of transactional databases in order to gain a competitive advantage. The first step is to select the items we are interested in studying and then define the transactions for them. To use any pattern mining algorithm, we must transform the data from its data frame format into transactions, such that each row corresponds to a transaction and each column indicates an item. Defining these matrices requires experimentation, since each method gives different results and has its own advantages. To choose items, we focus on each network device to extract specific correlations. We then identify which devices most frequently raised alarms in the same time bin.
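The transformation into transactions described above can be sketched as follows. This is a minimal illustration, not the thesis's actual pipeline: the record layout (timestamp, device), the 15-minute bin width, and the device names are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

EPOCH = datetime(1970, 1, 1)  # fixed reference point for computing bin indices


def build_transactions(alarms, bin_minutes=15):
    """Group the devices that raised alarms within the same time bin.

    `alarms` is an iterable of (timestamp, device) pairs (hypothetical layout).
    Returns one set of devices per non-empty bin, in chronological order.
    """
    bins = defaultdict(set)
    for timestamp, device in alarms:
        # Index of the fixed-width bin that contains this timestamp.
        seconds = (timestamp - EPOCH).total_seconds()
        bin_index = int(seconds // (bin_minutes * 60))
        bins[bin_index].add(device)
    return [bins[key] for key in sorted(bins)]


def to_binary_matrix(transactions):
    """One row per transaction, one column per item (1 = item present)."""
    items = sorted(set().union(*transactions))
    matrix = [[1 if item in t else 0 for item in items] for t in transactions]
    return items, matrix


# Hypothetical alarm records, for illustration only.
alarms = [
    (datetime(2017, 5, 1, 10, 2), "plant_A"),
    (datetime(2017, 5, 1, 10, 9), "plant_B"),
    (datetime(2017, 5, 1, 10, 20), "plant_A"),
    (datetime(2017, 5, 1, 10, 47), "plant_C"),
    (datetime(2017, 5, 1, 10, 55), "plant_A"),
]

transactions = build_transactions(alarms)
items, matrix = to_binary_matrix(transactions)
```

The bin width is the main tuning knob in this sketch: a wider bin groups more devices into each transaction, which is one reason why defining the matrix requires experimentation.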
We focus on the province of Turin to reduce complexity and study two datasets, reported in May and September 2017. We then apply frequent pattern mining methods to this matrix to extract frequent items, which are later used to find temporal and spatial co-occurrences. Some temporal correlations among power plants and events are evident. Nevertheless, finding direct spatial correlations is harder because we have no information about the topological connectivity among plants. We outline the most significant mutual rules, which hold true with high probability in two selected provinces located close to each other (Turin and Milan). We assess the significance of a rule through its measures of interestingness, such as lift, support, and confidence. Visualization of these rules, together with knowledge from domain experts, is another measure of importance. We show how the observed frequent patterns help us recognize possible future anomalies as situations that appeared in the past, and prevent them from happening again. We conclude with the feedback from the TIM network maintenance team in Rome, who confirmed that rules similar to those we found were already present in their system, so our automatically extracted rules are useful for their systems. Moreover, TIM will use the rules we extracted as input to machine learning algorithms to "detect patterns". These rules are stored in their systems as a list of "situations", presented together with metadata (location, resolution, etc.).
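The three interestingness measures named above have standard definitions, which can be sketched for a single rule "antecedent implies consequent" as follows. This is an illustrative example only: the transactions and device names are made up, and the abstract does not state which library or algorithm the thesis used to compute these values.

```python
def count_containing(transactions, itemset):
    """Number of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent.

    support    = P(antecedent and consequent)
    confidence = P(consequent | antecedent)
    lift       = confidence / P(consequent)
    """
    total = len(transactions)
    n_antecedent = count_containing(transactions, antecedent)
    n_consequent = count_containing(transactions, consequent)
    n_both = count_containing(transactions, set(antecedent) | set(consequent))
    if total == 0 or n_antecedent == 0 or n_consequent == 0:
        raise ValueError("rule is undefined: antecedent or consequent never occurs")
    confidence = n_both / n_antecedent
    return {
        "support": n_both / total,
        "confidence": confidence,
        "lift": confidence / (n_consequent / total),
    }


# Hypothetical transactions: devices that raised alarms in the same time bin.
example_transactions = [
    {"plant_A", "plant_B"},
    {"plant_A", "plant_B", "plant_C"},
    {"plant_A"},
    {"plant_B", "plant_C"},
    {"plant_A", "plant_B"},
]

metrics = rule_metrics(example_transactions, {"plant_A"}, {"plant_B"})
```

A lift above 1 indicates that the two sides of the rule co-occur more often than would be expected if they were independent; a lift below 1, as in this toy data, indicates the opposite.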

Supervisors: Marco Mellia
Academic year: 2018/19
Publication type: Electronic
Number of pages: 101
Subjects:
Degree programme: Master's degree programme in Communications And Computer Networks Engineering (Ingegneria Telematica E Delle Comunicazioni)
Degree class: New system > Master's degree > LM-27 - Telecommunications Engineering (INGEGNERIA DELLE TELECOMUNICAZIONI)
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/9017