Politecnico di Torino

Accelerating Federated Learning via In-Network Processing

Vera Altamore

Accelerating Federated Learning via In-Network Processing.

Rel. Guido Marchetto, Alessio Sacco. Politecnico di Torino, Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering), 2022

License: Creative Commons Attribution Non-commercial No Derivatives.

The unceasing development of Machine Learning (ML) and the evolution of Deep Learning have revolutionized many application domains, ranging from natural language processing to video analytics, biology, and medical predictions. The most common approach to training ML models is cloud-centric: data owners transmit the training data to a public cloud server, where more powerful resources reside, for processing. However, this approach is often unfeasible due to privacy laws and restrictions, as well as the burden placed on network communications by the massive quantities of data that must be transmitted to a distant cloud server. To solve these problems, Google introduced in 2016 the concept of Federated Learning (FL), with the objective of building machine learning models that take into account the security and privacy of data. In FL, instead of transferring the data to central servers, the ML model itself is deployed to the individual devices to train on their data, and only the parameters of the trained models are sent to the central ML/DL model for global training. Thanks to this principle, FL is widely used today in the sales, financial, medical, and Internet of Things (IoT) fields, where data privacy is essential. In particular, the underlying architecture can include many devices, each with its own dataset, and a central server responsible for aggregating their contributions into a global model. However, despite the privacy and security benefits, this approach can lead to synchronization issues: the network and the server turn into bottlenecks, and the load may become unsustainable. To this end, this thesis proposes a novel FL model that uses programmable P4 switches to compute intermediate aggregations and reduce the traffic on the network. The use of edge nodes for in-network model caching and gradient aggregation alleviates the bottleneck effect of the central FL server and further accelerates the entire training process.
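The aggregation step described above is commonly realized as FedAvg-style weighted averaging: the server combines each client's locally trained parameters, weighted by that client's dataset size. A minimal sketch (function and variable names are illustrative, not taken from the thesis):

```python
from typing import List

def fedavg(client_weights: List[List[float]], client_sizes: List[int]) -> List[float]:
    """Weighted average of per-client model parameters.

    Each client contributes its locally trained parameter vector; the
    server weights each contribution by the client's dataset size, so
    clients with more data influence the global model proportionally.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    global_model = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            global_model[i] += w * (size / total)
    return global_model

# Example: two clients with different dataset sizes.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [1, 3]
print(fedavg(clients, sizes))  # -> [2.5, 3.5]
```

Note that only parameter vectors cross the network; the raw training data never leaves the clients, which is the privacy property FL is built on.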
In detail, we modified a traditional FL framework, Flower, to communicate with P4 switches using a custom protocol that carries the model parameters, and we adapted the P4 switch behavior to support gradient aggregation. In addition, we compare the execution time of the proposed model against current state-of-the-art models and verify the speedup of the global training phase.
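The in-network aggregation idea can be sketched as follows: an edge switch sums the gradient vectors arriving from its attached clients and forwards a single partial aggregate (plus a client count) upstream, so the central server receives one message per switch instead of one per client. The Python below is only an illustration of the arithmetic; the thesis implements this logic on programmable switches in P4, and all names here are hypothetical.

```python
from typing import List, Tuple

def switch_aggregate(gradients: List[List[float]]) -> Tuple[List[float], int]:
    """Element-wise sum of the gradients seen by one switch, with a client count."""
    acc = [0.0] * len(gradients[0])
    for g in gradients:
        for i, v in enumerate(g):
            acc[i] += v
    return acc, len(gradients)

def server_combine(partials: List[Tuple[List[float], int]]) -> List[float]:
    """Combine per-switch partial sums into the global average gradient."""
    total_clients = sum(n for _, n in partials)
    dim = len(partials[0][0])
    avg = [0.0] * dim
    for acc, _ in partials:
        for i, v in enumerate(acc):
            avg[i] += v
    return [v / total_clients for v in avg]

# Two switches, each serving two clients: the server sees 2 messages, not 4.
p1 = switch_aggregate([[1.0, 1.0], [3.0, 3.0]])
p2 = switch_aggregate([[2.0, 0.0], [2.0, 4.0]])
print(server_combine([p1, p2]))  # -> [2.0, 2.0]
```

Because summation is associative and commutative, the partial aggregates computed in the network yield exactly the same global average the server would have computed from the individual updates, while the upstream traffic shrinks from one message per client to one per switch.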

Supervisors: Guido Marchetto, Alessio Sacco
Academic year: 2021/22
Publication type: Electronic
Number of Pages: 92
Degree programme: Corso di laurea magistrale in Ingegneria Informatica (Computer Engineering)
Degree class: New organization > Master of Science > LM-32 - COMPUTER SYSTEMS ENGINEERING
Collaborating institutions: Politecnico di Torino
URI: http://webthesis.biblio.polito.it/id/eprint/22674