Machine Learning for malware characterization and identification

Marco Saracino


Supervisors: Antonio Lioy, Andrea Atzeni. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2023

Thesis (PDF, 4MB). License: Creative Commons Attribution Non-commercial No Derivatives.
Attached documents (ZIP archive, 135MB). License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Malware is one of the most important threats that needs to be addressed today. Malicious programs have evolved over time, becoming more numerous and more complex. Zero-day malware is new malware that is already spreading on the Internet but has not yet been identified. Traditional signature-based detection systems fail to detect these new malicious files: because the files have not yet been analyzed, the systems have no valid signature with which to identify them, and they produce false negatives when the files are examined. To identify and classify malware without relying on signatures, I tried different machine learning techniques to understand which algorithm was best suited to the task.

First, I searched for datasets suitable for the task, and then analyzed the available malware to see what features could be extracted