Emanuel Buttaci
Does Perturbation Aid Convergence? An Alternative Approach to Federated Optimization Inspired by Spectral Graph Theory.
Supervisors: Giuseppe Carlo Calafiore, Federico Della Croce Di Dojola. Politecnico di Torino, Corso di laurea magistrale in Data Science And Engineering, 2024
PDF (Tesi_di_laurea) - Thesis. License: Creative Commons Attribution Non-commercial No Derivatives. Download (1MB)
Archive (ZIP) (Documenti_allegati) - Other attachments. License: Creative Commons Attribution Non-commercial No Derivatives. Download (1MB)
Abstract:
Federated learning has emerged over the last decade as a distributed optimization paradigm, driven by the rapidly increasing number of devices, such as user smartphones, capable of the heavier computation needed to train machine learning models collaboratively. Since its early days, federated learning has relied on gradient-based optimization to minimize a shared loss objective across participating agents. In this respect, the statistical heterogeneity between users' datasets has always been a conspicuous obstacle to the global convergence of the shared optimization procedure. In the first part of this thesis, we propose a fresh interpretation of such heterogeneity through a mathematical framework that reimagines any federated network as a similarity graph built on the statistical discrepancies between clients' data. We then formulate an alternative notion of heterogeneity and highlight its connection to the spectrum of the graph Laplacian. Our model shows how a network statistically evolves as we alter the overall dissimilarity between its clients. In the second part of the dissertation, we focus on the convergence properties of federated optimization algorithms and propose a novel framework in which each client locally performs a perturbed gradient step that leverages prior information about other statistically similar clients. Choosing the popular algorithm FedProx as a baseline, we provide a convergence analysis of both FedProx and our algorithm in the convex and nonconvex settings under the smoothness assumption. We show theoretically that, owing to a minor change in the update rule, our procedure achieves a quantifiable speedup in the exponential contraction factor in the strongly convex case compared with the baseline. Lastly, using FedAvg as a term of comparison, we support our conclusions with experimental results on the CIFAR10 and FEMNIST datasets, where our algorithm accelerates convergence by a margin of 30 rounds while modestly improving generalization on unseen data in heterogeneous settings.
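The abstract's two main ingredients, a client similarity graph whose Laplacian spectrum quantifies heterogeneity and a FedProx-style local step with a perturbed proximal anchor, can be illustrated with a short sketch. The following is a minimal NumPy illustration, not the thesis's exact formulation: it assumes the statistical discrepancy is the total-variation distance between clients' label distributions, that edge weights come from a Gaussian kernel on that distance, and that the perturbation mixes the global model with a similarity-weighted average of neighbouring clients' models. All names and parameters (tv_distance, perturbed_prox_step, sigma, mu, eps) are hypothetical.

```python
# Minimal sketch of the similarity-graph and perturbed proximal-step ideas.
# Assumptions (not from the thesis): total-variation distance between label
# distributions as the discrepancy, Gaussian-kernel edge weights, and a
# proximal anchor perturbed toward similar clients' models.
import numpy as np

def tv_distance(p, q):
    """Total-variation distance between two label distributions."""
    return 0.5 * np.abs(p - q).sum()

def similarity_graph(label_dists, sigma=0.5):
    """Weighted adjacency matrix from pairwise statistical discrepancies."""
    n = len(label_dists)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = tv_distance(label_dists[i], label_dists[j])
            W[i, j] = W[j, i] = np.exp(-d**2 / (2 * sigma**2))
    return W

def laplacian_spectrum(W):
    """Eigenvalues of the (unnormalised) graph Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return np.sort(np.linalg.eigvalsh(L))

def perturbed_prox_step(w_i, grad_i, w_global, W, models, i, lr=0.1, mu=0.1, eps=0.1):
    """One local step: gradient of the client loss plus a proximal pull toward
    an anchor that mixes the global model with similar clients' models."""
    weights = W[i] / (W[i].sum() + 1e-12)                 # similarity weights of client i
    neighbour_avg = (weights[:, None] * models).sum(axis=0)
    anchor = (1 - eps) * w_global + eps * neighbour_avg   # perturbed anchor (assumption)
    return w_i - lr * (grad_i + mu * (w_i - anchor))

# Toy usage: 4 clients, 3 classes, 5-dimensional models.
rng = np.random.default_rng(0)
label_dists = rng.dirichlet(np.ones(3), size=4)           # heterogeneous label mixes
W = similarity_graph(label_dists)
print("Laplacian spectrum:", laplacian_spectrum(W))       # spectrum as a heterogeneity proxy

models = rng.normal(size=(4, 5))
w_global = models.mean(axis=0)
grad = rng.normal(size=5)                                 # stand-in for a real local gradient
print(perturbed_prox_step(models[0], grad, w_global, W, models, i=0))
```

In this sketch, setting eps=0 recovers a plain FedProx-style proximal step toward the global model; a positive eps biases each client toward statistically similar neighbours, which is the kind of minor change to the update rule the abstract refers to.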
Supervisors: Giuseppe Carlo Calafiore, Federico Della Croce Di Dojola
Academic year: 2023/24
Publication type: Electronic
Number of pages: 126
Subjects:
Degree programme: Corso di laurea magistrale in Data Science And Engineering
Degree class: New regulations > Master's degree > LM-32 - INGEGNERIA INFORMATICA (Computer Engineering)
Joint supervision institution: UNIVERSIDAD POLITECNICA DE MADRID - ETS DE INGENIEROS INFORMATICOS (SPAIN)
Collaborating institutions: Universidad Politecnica de Madrid
URI: http://webthesis.biblio.polito.it/id/eprint/31844