Potential of DECICE for Federated Learning Use Cases
Federated Learning (FL) is a candidate use case in the DECICE project for decentralized machine learning across edge devices. Each device processes its data locally for model training and sends only model updates to a central server. FL thus preserves privacy by avoiding raw data sharing and minimizes bandwidth use, although sufficient local compute power is essential for efficient training.
In FL, a global prediction model is shared with multiple edge devices. Each device trains this model on locally available or stored data, and only model updates, typically in the form of gradients, are sent back to the central server. This scheme reduces network bandwidth usage and enhances data privacy, as the local training data is never transmitted.
As an example, consider next-word prediction on a mobile keyboard [1], as illustrated in the following figure. User input can be used to train language models across a large number of mobile phones, so that all users benefit from auto-completion of their next word based on the shared predictions. However, users may wish to protect their privacy and be reluctant to share their input data, and they also want to avoid wasting the limited bandwidth and battery power of their phones. According to Li et al. [1], federated learning algorithms have the potential to demonstrate the feasibility and benefit of training language models on the historical data available on client devices without exporting sensitive information to the servers.
In the setup shown in the following figure, each mobile phone gathers data from user input and trains the model on this local data. During each communication round, a subset of the phones sends its locally trained model as an update to the central server. Such an update is much smaller than the raw data, and the original user input cannot easily be deduced from it. The server incorporates the users' updates into a new global model, which is then rolled out to the devices. Assuming no further changes to the training data, this process eventually converges to a global model and the training is complete.
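To make such a communication round concrete, the following is a minimal sketch of federated averaging in Python. All names, the linear toy model, and the client data are illustrative assumptions rather than DECICE code; the server weights each client's update by its local data size, in the spirit of the FedAvg scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(global_w, X, y, lr=0.1, epochs=5):
    """Train a linear least-squares model locally; return only the delta."""
    w = global_w.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of the local squared loss
        w -= lr * grad
    return w - global_w                    # the only thing that leaves the device

def server_round(global_w, clients, fraction=0.5):
    """One synchronous round: sample a client subset, average their updates."""
    k = max(1, int(fraction * len(clients)))
    chosen = rng.choice(len(clients), size=k, replace=False)
    sizes = np.array([len(clients[i][1]) for i in chosen], dtype=float)
    updates = [local_update(global_w, *clients[i]) for i in chosen]
    # Weight each update by the client's local data size (FedAvg-style).
    avg = sum(s * u for s, u in zip(sizes, updates)) / sizes.sum()
    return global_w + avg

# Toy federation: each client holds noisy samples of the same linear task.
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(10):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w + 0.1 * rng.normal(size=50)))

w = np.zeros(2)
for _ in range(20):
    w = server_round(w, clients)
print("estimated weights:", w)  # converges towards true_w
```

Note that each client only ever transmits a weight delta and the server never sees the raw samples, which mirrors the bandwidth and privacy argument above.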
Scheduling in Edge Federated Learning
Edge devices are inherently heterogeneous. In each iteration of synchronous federated learning, the participating devices must deliver their computational results to the central server, so the training speed is dictated by the slowest device and by the network bandwidth. An efficient scheduler therefore needs to allocate suitable edge resources for a given application. Currently, there are at least four major research directions towards this goal [2].
- Participant selection: In this scheme, some nodes are initially selected at random as participants, perform the computations on their local private data for one iteration, and their training speed is analyzed. Node selection is then posed as a mathematical optimization problem based on the available computational resources and the previous training-time information. The effects of different scheduling policies, e.g., random scheduling (RS), round robin (RR), and proportional fair (PF), have been studied for a given problem setting [3]; a minimal sketch of these policies follows after this list.
- Resource optimization: Due to the heterogeneity of edge nodes, the available resources in terms of computational power and network bandwidth may vary drastically. To improve resource allocation, nodes with more compute power should receive a proportionally larger share of the compute tasks.
- Asynchronous training: The majority of edge federated learning research focuses on synchronous training; however, asynchronous training could significantly increase the capabilities of federated learning in heterogeneous environments.
- Incentive mechanism: In environments with independent users, it might be necessary to provide some form of incentive to motivate users to collaborate in a federated learning network. Some researchers have been investigating strategies for compensating users to reward them for contributing their compute power or input data.
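As referenced in the participant-selection item above, the following is a minimal sketch of the three scheduling policies RS, RR, and PF. The channel-rate model is an illustrative assumption inspired by the wireless setting studied in [3], not a prescribed implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_scheduling(n_clients, k):
    """RS: pick k clients uniformly at random in every round."""
    return sorted(map(int, rng.choice(n_clients, size=k, replace=False)))

def round_robin(n_clients, k, round_idx):
    """RR: cycle deterministically through the clients in groups of k."""
    start = (round_idx * k) % n_clients
    return sorted(map(int, (start + np.arange(k)) % n_clients))

def proportional_fair(current_rate, average_rate, k):
    """PF: prefer clients whose current channel rate is high relative to
    their own long-term average, balancing throughput and fairness."""
    ratio = current_rate / average_rate
    return sorted(map(int, np.argsort(ratio)[-k:]))

# Toy example: 8 clients, 3 scheduled per round (rates are assumptions).
n, k = 8, 3
avg_rate = rng.uniform(1.0, 5.0, size=n)                 # long-term link rates
for r in range(3):
    cur_rate = avg_rate * rng.uniform(0.5, 1.5, size=n)  # simple fading model
    print(f"round {r}: RS={random_scheduling(n, k)}, "
          f"RR={round_robin(n, k, r)}, "
          f"PF={proportional_fair(cur_rate, avg_rate, k)}")
```

Proportional fair scheduling trades throughput against fairness: a client with a momentarily strong channel is preferred, but only relative to its own long-term average, so clients with persistently weak links are not starved.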