Anna Origlia
An Asynchronous Framework to Mitigate the Network Impact on Federated Learning.
Supervisors: Paolo Giaccone, Claudio Ettore Casetti. Politecnico di Torino, Master's degree programme in Communications And Computer Networks Engineering (Ingegneria Telematica E Delle Comunicazioni), 2023
PDF (Tesi_di_laurea)
- Thesis
License: Creative Commons Attribution Non-commercial No Derivatives. File size: 5MB.
Abstract: |
Federated learning (FL) is a technique introduced to extend machine learning (ML) with a distributed learning structure better suited to complex environments such as IoT or privacy-preserving applications. The federated aggregation process is controlled by a parameter server but, unlike centralised ML, it adds an on-site local training step on the devices; in this way, data is not disclosed and the communication overhead is significantly reduced. Because of its distributed nature, FL faces a highly heterogeneous environment: clients differ in computational capabilities (especially considering IoT devices), in dataset distributions (data is generated or collected on the device, and this may bias or pollute the population) and in reliability (clients may drop out mid-run). In addition, the network plays an important role in client-server communication, as it affects the total time needed for a client's update to reach the server. In synchronous FL, slow clients become the bottleneck of the whole process. In this work, we address the effect that network bandwidth and latency have on FL rounds and we propose a new framework for coping with it: our system architecture is composed of a centralised server, a pool of available clients and a number of mediators that act as intermediaries between clients and server, coordinating their communication. The mediators are positioned at the cloud edge, so as to reduce as much as possible the transmission latency of the clients' updates; we also assume that the bandwidth and delay between the mediators and the server are fixed and known. Mediators can run a request-acknowledgement procedure with each client before sending the round training instructions: this procedure is used to estimate each client's network conditions and to modify its behaviour to compensate for the network delay.
The proposed framework uses an asynchronous configuration: the aggregation strategy is an asynchronous version of FedAvg, with the advantage of being tolerant to stragglers; stragglers' updates are scaled according to their degree of staleness and then included by the server. |
---|---|
Supervisors: | Paolo Giaccone, Claudio Ettore Casetti |
Academic year: | 2022/23 |
Publication type: | Electronic |
Number of Pages: | 77 |
Subjects: | |
Degree programme: | Master's degree programme in Communications And Computer Networks Engineering (Ingegneria Telematica E Delle Comunicazioni) |
Degree class: | New organization > Master science > LM-27 - TELECOMMUNICATIONS ENGINEERING |
Collaborating companies: | UNSPECIFIED |
URI: | http://webthesis.biblio.polito.it/id/eprint/26769 |
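The abstract states that stragglers' updates are scaled by their staleness before the server includes them, but it does not give the scaling rule. The following is a minimal illustrative sketch, not the thesis implementation: it assumes a polynomial decay of the mixing weight, a choice that appears in the asynchronous FL literature, and it treats the model as a flat list of floats. The names `staleness_weight`, `AsyncServer`, `base_mix` and `decay` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


def staleness_weight(staleness: int, base_mix: float = 0.5, decay: float = 0.5) -> float:
    """Mixing weight for an update that is `staleness` model versions old.

    Assumption: polynomial decay, base_mix * (1 + staleness) ** -decay.
    The thesis abstract does not specify the actual scaling function.
    """
    if staleness < 0:
        raise ValueError("staleness must be non-negative")
    return base_mix * (1.0 + staleness) ** (-decay)


@dataclass
class AsyncServer:
    """Toy parameter server holding a flat weight vector (hypothetical)."""

    weights: List[float]
    version: int = 0  # incremented every time an update is merged

    def apply_update(self, client_weights: List[float], client_version: int) -> float:
        """Merge one client update as soon as it arrives; return the weight used."""
        if len(client_weights) != len(self.weights):
            raise ValueError("client and server models must have the same size")
        # Staleness = number of server versions produced since the client
        # downloaded the model it trained on.
        staleness = self.version - client_version
        alpha = staleness_weight(staleness)
        # Convex combination: stale updates move the global model less.
        self.weights = [
            (1.0 - alpha) * w + alpha * c
            for w, c in zip(self.weights, client_weights)
        ]
        self.version += 1
        return alpha


server = AsyncServer(weights=[0.0, 0.0])
fresh_alpha = server.apply_update([1.0, 1.0], client_version=0)  # staleness 0
stale_alpha = server.apply_update([1.0, 1.0], client_version=0)  # staleness 1
```

In this sketch the second update was trained on version 0 but arrives when the server is at version 1, so it is merged with a smaller weight than the first. A real deployment would operate on tensors and would also account for the per-client network estimates produced by the mediators, which this sketch leaves out.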