polito.it
Politecnico di Torino (logo)

A greedy generalization algorithm for anonymizing Origin-Destination matrices

Pietro Armenante

A greedy generalization algorithm for anonymizing Origin-Destination matrices.

Rel. Luca Vassio, Nikhil Jha, Marco Mellia. Politecnico di Torino, NON SPECIFICATO, 2025

[img] PDF (Tesi_di_laurea) - Tesi
Licenza: Creative Commons Attribution Non-commercial No Derivatives.

Download (5MB)
Abstract:

Mobility data, describing the locations and movements of individuals within a geographic area, are a key resource for analysing and managing transportation systems. These data are frequently summarised in the form of origin–destination (OD) matrices, where each entry represents the number of trips between a specific origin and a specific destination. While OD-matrices are a valuable tool for modelling travel demand and managing transport networks, they also raise important privacy concerns, as they may allow the identification of individuals or sensitive travel patterns when published at high spatial resolution. In this work, a k-anonymisation algorithm specifically designed for OD-matrices is presented: the ODkAnon algorithm. Unlike traditional approaches that apply uniform spatial generalisation to both origin and destination cells, this method dynamically determines, for each flow that does not meet the k-anonymity threshold, whether to generalise the origin or the destination. This selective strategy aims to minimise the loss of spatial precision while ensuring that all flows satisfy the k-anonymity requirement. ODkAnon algorithm creates homogeneous geographical areas. This means that the final OD matrix will comprise non-overlapping areas. The proposed approach is applied to three real-world mobility datasets: GPS trajectories from the NetMob25 challenge in Paris (France), car-sharing trips in Turin (Italy), and one year of taxi trips in Porto (Portugal). To assess the performance of the proposed method, its results are compared with those obtained from three well-established algorithms in the literature (Mondrian, ATG, and OIGH), evaluating each approach in terms of both privacy protection and data utility. ATG is creating overlapping hexagons, while OIGH is homogeneous, but it is also uniform: it is not creating hexagons of different sizes, but it is giving a unique size to each hexagon. Experimental results show that ODkAnon achieves competitive performance compared to existing methods. OIGH is the fastest, followed by ATG and then ODkAnon (while Mondrian’s runtime varies by dataset). Unlike ATG, ODkAnon automatically balances generalisation between origins and destinations without parameter tuning. In terms of utility, ODkAnon is generally comparable to ATG, in some cases even better, and consistently superior to OIGH. Mondrian often provides higher utility, but this comes at the cost of not enforcing hierarchical consistency or homogeneity in the resulting partitions. Moreover, using the Paris dataset, the ODkAnon algorithm has also been used to study and create different OD matrices both for the participants and for the population. Protecting the population produces a different anonymization. Even more interesting, when the protection is applied to the population, the participants’ OD matrix loses k-anonymity. The viceversa is also true. Moreover, it is possible to observe amplified differences when segmenting the population by sex, age and work. Consider the example of sex (men and women), with the same k thresholds. The two segments have a similar total number of trips. When protecting the participants, it is shown that protecting men is more challenging, as it requires very coarse hexagons, while for women the resulting hexagons remain much finer. Furthermore, when applying protection to the population, the difference becomes even more pronounced.

Relatori: Luca Vassio, Nikhil Jha, Marco Mellia
Anno accademico: 2025/26
Tipo di pubblicazione: Elettronica
Numero di pagine: 93
Soggetti:
Corso di laurea: NON SPECIFICATO
Classe di laurea: Nuovo ordinamento > Laurea magistrale > LM-32 - INGEGNERIA INFORMATICA
Aziende collaboratrici: NON SPECIFICATO
URI: http://webthesis.biblio.polito.it/id/eprint/37902
Modifica (riservato agli operatori) Modifica (riservato agli operatori)