Politecnico di Torino (logo)

Real-time flow monitoring based on time-space probabilistic data structures

Simone Porcino

Real-time flow monitoring based on time-space probabilistic data structures.

Rel. Paolo Giaccone, Alessandro Cornacchia. Politecnico di Torino, Corso di laurea magistrale in Communications And Computer Networks Engineering (Ingegneria Telematica E Delle Comunicazioni), 2023

PDF (Tesi_di_laurea) - Tesi
Licenza: Creative Commons Attribution Non-commercial No Derivatives.

Download (1MB) | Preview

Network monitoring is essential to ensure network performance, integrity and security. However, with the growth of link rates and simultaneous number of flows, real-time traffic monitoring is challenging because it requires network devices that perform operations on a time scale very reduced and, possibly, with a small amount of memory to store measurement data. Among the available metrics to asses network performance, this thesis refers to the round-trip time (RTT) of data flows, i.e. packets sharing the same flowID. To comply with the aforementioned constraints, the measurement device should be equipped with a very efficient data structure both for recording the time each outgoing packet enters the system and for querying it at the reception of its response to calculate their RTT. For this purpose, we propose a probabilistic data structure composed of a set of Bloom filter that are evenly spaced temporally. When an outgoing packet enters the system, it is stored within a specific Bloom filter based on the time interval it is covering at that precise moment. However, once a packet is stored, any information about its real arrival time in the system is lost, due to properties of probabilistic data structures. Then, when an incoming packet enters the system, to calculate its flow RTT, it is necessary to perform a lookup operation in each Bloom filter to identify the one where the associated outgoing packet may have been stored. During this phase, two aspects affect the RTT flow calculation: (1) the associated outgoing packet may end up being stored in several Bloom filters (called \textit{candidates}) due to the false positive event that any probabilistic data structure suffers from, and (2) the arrival time of the associated outgoing packet into the system is unknown. To address the first problem, a Bloom filter must be selected among the candidates according to some algorithm, while the second problem is solved by associating a reference time to each time slot (e.g. its intermediate value), which will be used as hypothetical arrival time for any packet stored inside it. Anyway, both problems strictly depend on the amount of memory of the data structure and how this is divided among the Bloom filters: for the same amount of memory and to cover the same time window, a large number of Bloom filters translates into a high granularity of time slots, therefore in a more accurate estimation of the RTT, since the distance between the real arrival time of outgoing packets in the system and their hypothetical arrival times is reduced; on the other hand, the same choice will imply having Bloom filters with low memory and high probability of false positive, which will results in more candidates to choose from and, consequently, an increase in the probability of selecting the wrong one. Hence, the aim of this thesis is to evaluate the accuracy of the system in measuring the flow RTT by varying the data structure memory and the time slot width.

Relators: Paolo Giaccone, Alessandro Cornacchia
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 72
Corso di laurea: Corso di laurea magistrale in Communications And Computer Networks Engineering (Ingegneria Telematica E Delle Comunicazioni)
Classe di laurea: New organization > Master science > LM-27 - TELECOMMUNICATIONS ENGINEERING
Aziende collaboratrici: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/26770
Modify record (reserved for operators) Modify record (reserved for operators)