Malware Family Classification with Semi-Supervised Learning

Maria Letizia Colangelo

Malware Family Classification with Semi-Supervised Learning.

Supervisors: Antonio Lioy, Andrea Atzeni. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2023

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract

In recent years, the spread of malware has increased exponentially, posing a significant challenge for cybersecurity experts. Against a constantly evolving landscape of unknown threats, including zero-day attacks, traditional signature-based approaches to malware detection have proven insufficient. Furthermore, adversaries adapt by modifying their malicious code, which reduces the efficacy of signature-based detection even more. To address these problems, machine learning models have been used to build behaviour-based malware detection systems, because they can generalise from data and detect previously unseen malware. These systems inspect code in order to identify any malicious or potentially harmful actions it performs.

Supervised learning shows promising results in detecting malicious code, but it is significantly limited by the considerable amount of manual effort required for labelling both malware and benign instances