
Machine learning methodologies for QoE prediction in satellite networks

Andrea Di Domenico

Machine learning methodologies for QoE prediction in satellite networks.

Supervisors: Marco Mellia, Danilo Giordano. Politecnico di Torino, Master's degree course in Computer Engineering (Ingegneria Informatica), 2023

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.


Satellite Communication (SatCom) enables Internet access in remote locations where conventional infrastructure is unavailable or too expensive to deploy. In a SatCom connection, a customer uses a parabolic dish to connect to a satellite, which forwards all of the customer’s traffic to a ground station, which in turn relays it to the Internet. Unlike traditional infrastructures, where the latency to retrieve content is on the order of tens of ms, SatCom’s latency is much higher: ∼550 ms for the satellite link to reach the ground station, plus the time the ground link takes to reach the content. In this scenario, slowdowns in either the satellite link or the ground link can severely impair the customers’ Quality of Experience (QoE). To identify and investigate such impairments, the SatCom operator needs models that estimate the customers’ QoE using only the network flows the customers generate. These models are useful to the Internet Service Provider (ISP) because, when QoE is poor, it can take action to improve customer satisfaction. Thus, the goal of this thesis is to develop a method for assessing customer QoE during web browsing. I develop a machine learning model to estimate the Speed Index or the onLoad, two of the metrics used to measure Web QoE, from data flowing through the operator’s network. This data is collected by Tstat, a custom flow monitoring software installed in the operator’s ground station. Tstat captures all IP packets and groups them into TCP and UDP flows, tracking how each flow evolves.
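The idea of grouping captured packets into flows can be illustrated with a minimal Python sketch. It is not Tstat's implementation: the `Packet` fields, the bidirectional 5-tuple key, and the per-flow counters are all assumptions chosen for illustration, and the sample packets are made up.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; field names are assumptions."""
    ts: float       # capture timestamp, in seconds
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str      # "TCP" or "UDP"
    size: int       # packet size, in bytes


def flow_key(p: Packet) -> tuple:
    """Build a direction-independent 5-tuple key for a packet."""
    a = (p.src_ip, p.src_port)
    b = (p.dst_ip, p.dst_port)
    # Order the two endpoints so both directions map to the same flow.
    return (p.proto,) + (a + b if a <= b else b + a)


def group_into_flows(packets):
    """Aggregate packets into per-flow counters keyed by flow_key."""
    flows = defaultdict(
        lambda: {"packets": 0, "bytes": 0, "first_ts": None, "last_ts": None}
    )
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        f = flows[flow_key(p)]
        f["packets"] += 1
        f["bytes"] += p.size
        if f["first_ts"] is None:
            f["first_ts"] = p.ts
        f["last_ts"] = p.ts
    return dict(flows)


# Made-up sample: one TCP exchange (both directions) and one UDP packet.
packets = [
    Packet(0.00, "10.0.0.1", 50000, "93.184.216.34", 443, "TCP", 100),
    Packet(0.55, "93.184.216.34", 443, "10.0.0.1", 50000, "TCP", 1400),
    Packet(0.60, "10.0.0.1", 50001, "8.8.8.8", 53, "UDP", 60),
]
flows = group_into_flows(packets)
```

Here the two TCP packets travel in opposite directions but end up in the same flow entry, while the UDP packet forms a flow of its own.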
The work is organized in four parts: (i) investigating metrics for measuring QoE in web browsing and tools for automatic benchmarking; (ii) creating a testbed for the automatic collection of network and QoE data during web visits; (iii) investigating the state of the art in machine learning techniques for regression and classification; and (iv) creating models to assess QoE prediction capabilities in a real scenario. The final dataset is created by merging two datasets: an active one, collected on the client side, which contains the label (the metric used to measure Web QoE) and other information such as website and URL; and a passive one, collected by Tstat at the ISP ground station, which contains the intercepted network flows. Relevant features are extracted from these data and used as predictors for the machine learning models. The work considers both regression models, such as linear regression or random forest regression, and classification models, such as support vector machines or random forest classification. Although the problem is intuitively a regression, it can also be formulated as a classification problem, e.g., by defining QoE categories: good, medium, and poor. The proposed model achieves promising performance in predicting Web QoE metrics on independent test sets. When aspects of real-world visits are taken into account, such as the presence of a local cache, and typical browser visits are simulated, the performance decreases.
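The merge of active and passive data, and the two ways of framing the label, can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration and is not taken from the thesis: the `Visit` and `Flow` record layouts, matching flows to a visit by time window, the two features, and the 3000 ms and 8000 ms thresholds that separate good, medium, and poor.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """Hypothetical active (client-side) record of one page visit."""
    url: str
    start: float      # visit start time, in seconds
    end: float        # visit end time, in seconds
    onload_ms: float  # measured onLoad, used as the label


@dataclass
class Flow:
    """Hypothetical passive (ground-station) flow record."""
    first_ts: float   # time of the first packet, in seconds
    total_bytes: int


# Illustrative thresholds only; not values from the thesis.
GOOD_MAX_MS = 3000.0
MEDIUM_MAX_MS = 8000.0


def qoe_class(onload_ms: float) -> str:
    """Map a continuous onLoad value to a coarse QoE category."""
    if onload_ms <= GOOD_MAX_MS:
        return "good"
    if onload_ms <= MEDIUM_MAX_MS:
        return "medium"
    return "poor"


def build_dataset(visits, flows):
    """Join each visit with the flows that start inside its time window."""
    rows = []
    for v in visits:
        matched = [f for f in flows if v.start <= f.first_ts <= v.end]
        rows.append({
            "url": v.url,
            "n_flows": len(matched),                           # feature
            "total_bytes": sum(f.total_bytes for f in matched),  # feature
            "onload_ms": v.onload_ms,                # regression target
            "qoe_class": qoe_class(v.onload_ms),     # classification target
        })
    return rows


# Made-up sample data.
visits = [
    Visit("https://example.org", 0.0, 5.0, 2500.0),
    Visit("https://example.com", 10.0, 22.0, 9500.0),
]
flows = [Flow(0.6, 12000), Flow(1.2, 30000), Flow(10.7, 500000)]
rows = build_dataset(visits, flows)
```

Each resulting row carries both targets, so the same features could feed either a regressor (predicting `onload_ms`) or a classifier (predicting `qoe_class`).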

Supervisors: Marco Mellia, Danilo Giordano
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 86
Degree course: Master's degree course in Computer Engineering (Ingegneria Informatica)
Degree class: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: Politecnico di Torino - SmartData@PoliTo
URI: http://webthesis.biblio.polito.it/id/eprint/26747