
Methodology and Measurements of Privacy Mechanisms for Online Advertising: The Case of Google's Topics API

Alberto Verna

Supervisors: Marco Mellia, Martino Trevisan, Nikhil Jha. Politecnico di Torino, Master's degree programme in Computer Engineering (Ingegneria Informatica), 2024

License: Creative Commons Attribution Non-commercial No Derivatives.

Abstract:

In recent years, the Web industry has been moving away from third-party cookies towards more privacy-oriented solutions for targeted online advertising. Among the proposed alternatives, Google's Topics API, a core component of the Privacy Sandbox framework, stands out. It is a browser-based solution that provides a user's topics of interest to a third-party service (e.g. a digital advertising platform) without revealing the websites they visit. As the initial experimentation phase concludes, all components of the Privacy Sandbox, including the Topics API, have now reached general availability. However, some researchers remain sceptical, cautioning that, although it is a better approach than third-party cookies, this new solution may still enable re-identification attacks and other privacy leaks. For this reason, Google currently limits the use of the Privacy Sandbox family to selected third-party services that must undergo an enrolment process. This thesis aims to quantify the adoption of the Topics API within the Web ecosystem, identify the third parties that enable its usage, and examine the practices they employ. Because these practices involve handling personal user information, they must also comply with local privacy regulations. The analysis was performed by deploying a Chromium-based web crawler to visit the 50,000 most popular websites worldwide, recording the usages of the Topics API on each website. Because the crawling was performed in the EU, where the GDPR is in force, the crawler mimics the behaviour of a user accepting the privacy policy by finding and clicking the "Accept" button of any banner found on the visited page. This approach makes it possible to distinguish usages before and after user consent is given, but only on websites where a privacy banner is found.
The results of the crawling show that a substantial number of third parties are already experimenting with the new features offered by the Topics API, in preparation for the future phase-out of third-party cookies. However, it is evident that some form of A/B testing is still taking place on a restricted number of websites and users, most likely run by the third parties themselves. The same results show a significant number of questionable or even anomalous usages: on some websites, Topics API calls (i) occur before the user accepts the privacy banner, where one is found, even on European domains, or (ii) come from a third party that has not yet undergone the enrolment process, although the browser is expected to deny calls from unauthorised domains. The latter issue is due to an implementation bug found in Chromium's source code, which allows the browser's authorisation checks to be bypassed by manually deleting or corrupting a specific configuration file. Moreover, most of the unauthorised domains turn out to be the very websites visited by the crawler, which suggests that some popular JavaScript libraries access the Topics API erroneously. While this new technology has the potential to replace third-party cookies as the de facto standard for interest-based advertising, it is still in its early phases of deployment. The crawling results highlight several issues that are typically associated with early implementations: privacy regulation violations, implementation bugs that allow circumvention of abuse protections, and deployment errors.
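
The two anomaly classes described above can be illustrated with a minimal Python sketch. It is not the thesis's actual crawler or analysis code: the record layout (`TopicsCall`), the flag names, the `classify` helper and the `.example` domains are all hypothetical, and comparing the caller with the visited site by plain string equality is a simplifying assumption (a real analysis would likely compare registrable domains).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicsCall:
    """One recorded Topics API usage (hypothetical record layout)."""
    site: str            # website visited by the crawler
    caller: str          # domain that invoked the Topics API
    banner_found: bool   # whether a privacy banner was detected on the site
    after_consent: bool  # whether the call was seen after clicking "Accept"


def classify(call: TopicsCall, enrolled: set[str]) -> list[str]:
    """Return the anomaly flags that apply to a single recorded call."""
    flags = []
    # (i) Call observed before consent; only decidable when a banner exists.
    if call.banner_found and not call.after_consent:
        flags.append("before_consent")
    # (ii) Caller is not in the set of enrolled (authorised) domains.
    if call.caller not in enrolled:
        flags.append("not_enrolled")
        # Unenrolled caller that is the visited website itself.
        if call.caller == call.site:
            flags.append("caller_is_visited_site")
    return flags


# Illustrative data only; these domains are placeholders, not measurements.
enrolled_domains = {"ads.example"}
calls = [
    TopicsCall("news.example", "ads.example", True, True),
    TopicsCall("news.example", "ads.example", True, False),
    TopicsCall("shop.example", "shop.example", False, False),
]
for call in calls:
    print(call.site, call.caller, classify(call, enrolled_domains))
```

Calls on websites without a banner are never flagged as `before_consent`, which mirrors the limitation stated in the abstract: the before/after distinction exists only where a privacy banner is found.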

Supervisors: Marco Mellia, Martino Trevisan, Nikhil Jha
Academic year: 2024/25
Publication type: Electronic
Number of pages: 72
Subjects:
Degree programme: Master's degree programme in Computer Engineering (Ingegneria Informatica)
Degree class: New system > Master's degree > LM-32 - COMPUTER ENGINEERING (INGEGNERIA INFORMATICA)
Partner organisations: Politecnico di Torino - SmartData@PoliTo
URI: http://webthesis.biblio.polito.it/id/eprint/33213