Machine Learning Regression Model For Crowd-Monitoring Through WiFi Probe-Request Analysis.

Roon Mullaaliu

Supervisors: Claudio Ettore Casetti, Paolo Giaccone. Politecnico di Torino, Master's degree course in Data Science And Engineering, 2024

PDF (Tesi_di_laurea) - Thesis
License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

The proliferation of smart and IoT devices is generating vast amounts of data, raising the question of how to use this information to improve efficiency, drive innovation, and support decision-making. One significant application is crowd monitoring, which is becoming crucial in urban environments for improving public safety, optimizing traffic flow, and managing events and public spaces. The analysis of wireless network traces has emerged as an effective method for real-time crowd size estimation and movement pattern analysis, and WiFi Probe Requests have proven particularly valuable for this purpose. This thesis focuses on estimating the number of people present by processing and analyzing captured WiFi Probe Request messages. Because modern operating systems randomize MAC addresses for privacy reasons, a device can no longer be identified by a single address, which makes people counting a challenging task. To address this, the thesis develops and presents a machine learning framework built around a neural-network regression model that outputs numerical estimates of the people count. The framework covers the full pipeline. A preprocessing phase filters the captured data and extracts the features used by the model, and dedicated datasets are defined for model training and evaluation. Training uses large amounts of data from a realistic software simulator together with real captures, with augmentation techniques applied to enlarge the dataset to a size suitable for machine learning. The regression model was first tested on traces generated by the software simulator, where it showed excellent results in a synthetic environment and demonstrated the validity of the approach.
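As an illustration of the kind of preprocessing described above, the following minimal sketch groups captured probe requests into fixed time windows and derives a few per-window features. The record layout `(timestamp, mac)`, the 60-second window, and the three feature names are assumptions made for this example and are not taken from the thesis. The only protocol fact it relies on is that randomized MAC addresses are locally administered, meaning the second-least-significant bit of the first octet is set.

```python
from collections import defaultdict


def is_locally_administered(mac: str) -> bool:
    """True if the locally administered bit (0x02 of the first octet) is set.

    Randomized MAC addresses have this bit set.
    """
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)


def window_features(probes, window_s=60):
    """Group (timestamp, mac) records into windows and compute toy features.

    `probes` is an iterable of (timestamp_seconds, mac_string) pairs.
    The feature names below are hypothetical, chosen for illustration.
    """
    windows = defaultdict(list)
    for ts, mac in probes:
        windows[int(ts // window_s)].append(mac.lower())

    features = {}
    for idx, macs in sorted(windows.items()):
        unique = set(macs)
        randomized = sum(1 for m in unique if is_locally_administered(m))
        features[idx] = {
            "n_probes": len(macs),
            "n_unique_macs": len(unique),
            "frac_randomized": randomized / len(unique),
        }
    return features


# Hypothetical capture: three probes in the first minute, one in the second.
sample = [
    (3.0, "DA:A1:19:00:00:01"),   # first octet 0xDA -> locally administered
    (10.5, "da:a1:19:00:00:01"),  # same address, different case
    (42.0, "00:1B:63:00:00:02"),  # first octet 0x00 -> globally unique
    (75.0, "3E:22:FB:00:00:03"),  # first octet 0x3E -> locally administered
]
feats = window_features(sample)
```

Because a device may change its randomized address over time, a count such as `n_unique_macs` cannot be read directly as a number of devices; in a setup like this it would serve only as one input to the regression model.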
Subsequently, the framework was tested on real captures, with training performed first on simulated traces and then on real traces. The results were promising in both cases, particularly for the model trained on real data. The model parameters were carefully tuned to maximize performance. The best results were obtained with a model based on Long Short-Term Memory (LSTM): by analyzing temporally consecutive captures and identifying temporal patterns within them, the LSTM network significantly improved on the results of a simple Fully Connected Neural Network. The proposed regression approach also incorporates another people-counting model, based on clustering, by including its outputs as features, which allowed the regression model to retain and improve on the positive results of the clustering approach. The findings of this research have practical implications across a variety of domains. The most important applications are safety in public environments, pedestrian traffic management, and emergency response, where predicting congested areas in real time enables prompt action. Additionally, this thesis addresses privacy concerns related to processing and storing WiFi Probe Requests, particularly the handling of MAC addresses, which are considered personal data and are therefore subject to the EU GDPR. It illustrates the problems with current solutions and presents more efficient alternatives based on privacy techniques such as Bloom filters and differential privacy.
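To make the Bloom-filter idea mentioned above concrete, here is a minimal sketch of a filter that records whether a MAC address has been seen without storing the address itself. The class name, the bit-array size, the number of hash functions, and the use of salted SHA-256 are all assumptions for illustration; the design actually proposed in the thesis may differ. The cardinality estimate uses the standard formula n ≈ -(m/k) ln(1 - X/m), where m is the number of bits, k the number of hash functions, and X the number of set bits.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter; parameters are illustrative, not tuned."""

    def __init__(self, m_bits=1024, k_hashes=4):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray(m_bits)  # one byte per bit, for simplicity

    def _positions(self, item: str):
        # Derive k positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        # May return a false positive, never a false negative.
        return all(self.bits[pos] for pos in self._positions(item))

    def estimated_count(self) -> float:
        """Estimate distinct insertions from the fraction of set bits."""
        set_bits = sum(self.bits)
        if set_bits == self.m:
            return float("inf")
        return -(self.m / self.k) * math.log(1 - set_bits / self.m)


# Hypothetical addresses; the first one is inserted twice.
bf = BloomFilter()
for mac in ["da:a1:19:00:00:01", "00:1b:63:00:00:02", "da:a1:19:00:00:01"]:
    bf.add(mac)
```

A repeated address sets no new bits, so `estimated_count()` reflects distinct addresses rather than the total number of probes.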

Supervisors: Claudio Ettore Casetti, Paolo Giaccone
Academic year: 2023/24
Publication type: Electronic
Number of pages: 91
Subjects:
Degree course: Master's degree course in Data Science And Engineering
Degree class: New system > Master's degree > LM-32 - Computer Engineering
Collaborating companies: Not specified
URI: http://webthesis.biblio.polito.it/id/eprint/31893