
Sequential to Parallel Federated Learning with Semantic-Aware Client Groupings

Andrea Silvi

Sequential to Parallel Federated Learning with Semantic-Aware Client Groupings. Rel. Barbara Caputo, Debora Caldarola, Marco Ciccone. Politecnico di Torino, Corso di laurea magistrale in Data Science And Engineering, 2022.

Abstract:

Federated Learning (FL) has recently attracted considerable attention in both academic and industrial research, since it enables different devices, or clients, to collaboratively train a shared model while keeping their local data private. This paradigm allows training on data that would otherwise be unavailable due to privacy concerns, and it is also useful when the sheer volume of training data is too large for a single storage system to handle. While learning from privacy-protected data brings many benefits, the paradigm also introduces challenges: the first is the cost of communication between the server, which orchestrates training, and the clients; in addition, heterogeneity among the participating clients can degrade the performance of FL algorithms and significantly slow down convergence. Recent literature has attempted to address this issue with regularization techniques or by drawing inspiration from multitask learning.

This thesis extends FedSeq, an anti-clustering algorithm that groups heterogeneous clients together and trains the model sequentially within each group in a privacy-compliant manner. One disadvantage of FedSeq is that it requires a server-side public dataset to perform meaningful client groupings, which may not always be available in real-world settings. We therefore first propose a solution that efficiently groups clients according to their data distribution without requiring a public dataset, while still achieving on-par performance.

Another limitation of FedSeq is the lack of training parallelism within groups: each client must wait for the previous one to complete its local training, resulting in a longer training phase than in other FL algorithms. We investigate ways to speed up this process by borrowing the Sequential-to-Parallel concept, which enables a more dynamic way of grouping clients and training the model. Specifically, every few rounds the algorithm dynamically re-clusters clients with different class distributions into groups that grow in number and shrink in size as training progresses, effectively switching from mostly sequential training on a few very large groups of clients at the beginning to highly parallel training on many extremely small groups at the end. We show that, when combined with our client-distribution extraction approaches, this method accelerates convergence and reduces communication costs. Finally, since FedSeq performs at a state-of-the-art level on heterogeneous computer vision toy datasets, we further test its generalization and fast-convergence capabilities on a broader range of computer vision and natural language processing datasets.
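To make the anti-clustering idea concrete, the following is a minimal Python sketch, not the thesis' actual criterion: it assumes each client can supply an estimate of its own label distribution (however obtained), and the function name greedy_anticluster and its inputs are illustrative. Each client is greedily assigned to the open group whose running mean distribution it resembles least, so every group ends up internally heterogeneous and closer to IID in aggregate.

import numpy as np

def greedy_anticluster(distributions, num_groups):
    """Assign each client to the group whose running label distribution
    is most different from the client's own (an anti-clustering
    heuristic; the thesis' actual grouping criterion may differ).

    distributions: (n_clients, n_classes) array of per-client label
    frequencies, however they are estimated client-side.
    """
    n_clients, n_classes = distributions.shape
    group_sums = np.zeros((num_groups, n_classes))
    group_sizes = np.zeros(num_groups, dtype=int)
    groups = [[] for _ in range(num_groups)]
    target = int(np.ceil(n_clients / num_groups))  # keep sizes balanced

    for cid, dist in enumerate(distributions):
        open_groups = [g for g in range(num_groups) if group_sizes[g] < target]
        # Cosine similarity between the client and each open group's
        # mean distribution; pick the LEAST similar open group.
        scores = []
        for g in open_groups:
            mean_dist = group_sums[g] / max(group_sizes[g], 1)
            denom = (np.linalg.norm(dist) * np.linalg.norm(mean_dist)) or 1.0
            scores.append(dist @ mean_dist / denom)
        g = open_groups[int(np.argmin(scores))]
        groups[g].append(cid)
        group_sums[g] += dist
        group_sizes[g] += 1
    return groups

# Example: 20 clients with skewed (non-IID) 10-class label distributions.
rng = np.random.default_rng(0)
dists = rng.dirichlet(alpha=[0.5] * 10, size=20)
print(greedy_anticluster(dists, num_groups=4))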
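The Sequential-to-Parallel schedule can likewise be sketched with toy scalar "models": groups start few and large (training is mostly sequential, FedSeq-style) and, every few rounds, double in number while shrinking in size, until training is almost fully parallel. This is a minimal sketch under assumed names (Client, make_groups, regroup_every), not the thesis' implementation.

import copy
import random
from statistics import mean

class Client:
    """Toy client: local_update nudges a scalar model toward the
    client's local optimum, standing in for real SGD on private data."""
    def __init__(self, local_optimum, lr=0.5):
        self.local_optimum = local_optimum
        self.lr = lr

    def local_update(self, model):
        return model + self.lr * (self.local_optimum - model)

def make_groups(clients, num_groups):
    # Stand-in for the semantic-aware grouping: shuffle and deal
    # round-robin so each group mixes different local optima.
    shuffled = clients[:]
    random.shuffle(shuffled)
    return [shuffled[i::num_groups] for i in range(num_groups)]

def train_group_sequentially(model, group):
    # FedSeq-style pass: each client starts from the model left by the
    # previous one, so the group acts like a single "superclient".
    for client in group:
        model = client.local_update(model)
    return model

def run(server_model, clients, rounds, regroup_every=3):
    num_groups = 1  # begin fully sequential: one big group
    for r in range(rounds):
        if r and r % regroup_every == 0:
            # Group count grows, group size shrinks: training
            # becomes progressively more parallel.
            num_groups = min(len(clients), num_groups * 2)
        outputs = [train_group_sequentially(copy.deepcopy(server_model), g)
                   for g in make_groups(clients, num_groups)]
        server_model = mean(outputs)  # FedAvg-style averaging
    return server_model

clients = [Client(opt) for opt in (0.0, 2.0, 4.0, 6.0, 8.0, 10.0)]
print(run(server_model=0.0, clients=clients, rounds=12))  # converges near 5.0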

Relators: Barbara Caputo, Debora Caldarola, Marco Ciccone
Academic year: 2022/23
Publication type: Electronic
Number of Pages: 93
Additional Information: Restricted thesis. Full text not available.
Corso di laurea: Corso di laurea magistrale in Data Science And Engineering
Classe di laurea: New organization > Master science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating companies: UNSPECIFIED
URI: http://webthesis.biblio.polito.it/id/eprint/25566