Automatic Detection of Coordinated Events in Darknet Traffic

Luca Gioacchini

Supervisors: Luca Vassio, Francesca Soro, Idilio Drago. Politecnico di Torino, Master's degree programme in Ict For Smart Societies (Ict Per La Società Del Futuro), 2021

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

Darknets are network monitoring tools composed of sets of IP addresses that are announced in routing protocols but do not host any services. They constantly listen to incoming traffic and record it. The received packets are thus unsolicited and represent a privileged source of information for network security. Indeed, the lack of any production traffic in a darknet makes it easier to detect possible threats such as internal scans, brute-force attempts against services, etc. Detecting and evaluating coordinated events is an important step to fully exploit the darknet monitoring potential. Indeed, it could reduce the amount of data to be evaluated by security analysts and provide a richer picture of ongoing attacks on the Internet. Given the huge number of source IPs constantly targeting darknets, a manual analysis of the received traffic is impractical. Moreover, there is a lack of comprehensive ground truth that could be used to learn traffic patterns. In this thesis I evaluate the use of different methodologies based on unsupervised data mining for automatically detecting coordination among groups of source IPs contacting darknets. I investigate two hypotheses that could characterize the coordination: i) as sources sending traffic are usually controlled by a single entity, the level of traffic activity reaching a darknet from coordinated hosts should be similar. Thus, I study whether traffic intensity could indicate coordination; ii) traffic from coordinated sources should reach the darknet with some temporal correlation. Thus, I study whether coordinated sources are observed in the darknet simultaneously. I develop a complete machine learning pipeline to test these hypotheses. Considering the traffic intensity case, I design and evaluate a set of features that could represent traffic intensity and study unsupervised algorithms to group IPs based on the engineered features.
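As a rough illustration of what per-source traffic intensity features might look like, the sketch below aggregates packets by source IP. The record layout (`Packet`), the function name `intensity_features`, and the three features chosen (packet count, number of distinct destination ports, active time span) are assumptions made for this example; the abstract does not list the features actually engineered in the thesis.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """One packet seen by the darknet (hypothetical record layout)."""
    timestamp: float  # seconds since the start of the capture
    src_ip: str
    dst_port: int


def intensity_features(packets):
    """Return {src_ip: (packet_count, distinct_ports, active_seconds)}."""
    counts = defaultdict(int)
    ports = defaultdict(set)
    first_seen = {}
    last_seen = {}
    for p in packets:
        counts[p.src_ip] += 1
        ports[p.src_ip].add(p.dst_port)
        # Track the earliest and latest timestamp observed for each source.
        first_seen[p.src_ip] = min(first_seen.get(p.src_ip, p.timestamp), p.timestamp)
        last_seen[p.src_ip] = max(last_seen.get(p.src_ip, p.timestamp), p.timestamp)
    return {
        ip: (counts[ip], len(ports[ip]), last_seen[ip] - first_seen[ip])
        for ip in counts
    }


# Toy input using documentation-only address ranges (not real darknet data).
packets = [
    Packet(0.0, "192.0.2.1", 23),
    Packet(1.0, "192.0.2.1", 2323),
    Packet(4.0, "192.0.2.1", 23),
    Packet(2.0, "198.51.100.7", 443),
]
features = intensity_features(packets)
```

Feature vectors of this kind could then be handed to any unsupervised clustering algorithm, so that sources with similar activity levels end up in the same group.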
To study the temporal relationship among sources, I employ the word2vec algorithm, an approach used in text processing to find words that frequently occur near each other in sentences and documents. To overcome the lack of general ground truth and provide a sound validation of results, I rely on domain knowledge to build a dataset of coordinated IPs. I build a list of IPs belonging to well-known Internet scanners, such as security search engines, whose coordination is clearly visible on the darknet. My results suggest that the generated features make it possible to highlight both traffic intensity and temporal coordination. Indeed, when studying the spatial information provided by the feature sets used, IP addresses belonging to the same ground truth class belong to the same neighborhood with an average accuracy of 89%. Furthermore, the unsupervised algorithms are able to group together IP addresses exhibiting similar behaviors within a day of darknet traffic, but the cluster membership is not maintained over time. I observe that clusters built over different days of traffic are significantly dissimilar (adjusted mutual information of 0.5), which seems to be explained by changes in the behavior of the whole set of coordinated IP addresses. All in all, my results show that the approaches can reduce the number of data points to be analyzed by security analysts, putting together IP addresses that share a common behavior. Yet, algorithms must be applied to short time ranges to maximize the chances that real coordination is identified.
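To make the word2vec analogy concrete, the sketch below treats each source IP as a "word" and each time window of darknet traffic as a "sentence". The window-based corpus construction, the function names `build_corpus` and `cooccurrence_counts`, and the 60-second window are assumptions for illustration only; the abstract does not say how the thesis builds its corpus. The co-occurrence count is a simple stand-in for the signal a word2vec context window learns from, not word2vec itself.

```python
from collections import defaultdict


def build_corpus(events, window_seconds):
    """Group source IPs into per-window 'sentences', ordered by arrival time.

    events: iterable of (timestamp, src_ip) pairs.
    """
    windows = defaultdict(list)
    for timestamp, src_ip in sorted(events):
        windows[int(timestamp // window_seconds)].append(src_ip)
    return [windows[key] for key in sorted(windows)]


def cooccurrence_counts(corpus, context=1):
    """Count how often two distinct IPs appear within `context` positions."""
    counts = defaultdict(int)
    for sentence in corpus:
        for i, ip in enumerate(sentence):
            for other in sentence[i + 1 : i + 1 + context]:
                if other != ip:
                    counts[frozenset((ip, other))] += 1
    return dict(counts)


# Toy events using documentation-only address ranges (not real darknet data).
A, B, C = "192.0.2.1", "192.0.2.2", "203.0.113.9"
events = [(0.5, A), (1.0, B), (1.5, A), (61.0, C), (62.0, A), (63.0, B)]

corpus = build_corpus(events, window_seconds=60)
pairs = cooccurrence_counts(corpus, context=1)
```

A corpus in this list-of-token-lists shape is the kind of input that common word2vec implementations accept as their training sentences; sources that repeatedly arrive close together in time would then receive nearby embeddings.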

Supervisors: Luca Vassio, Francesca Soro, Idilio Drago
Academic year: 2020/21
Publication type: Electronic
Number of pages: 90
Subjects:
Degree programme: Master's degree programme in Ict For Smart Societies (Ict Per La Società Del Futuro)
Degree class: New regulations > Master's degree > LM-27 - Telecommunications Engineering
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/18007