Introduction to FedScale
FedScale is an open-source platform for federated learning (FL), a machine learning approach in which models are trained on decentralized data rather than on a single centrally collected dataset. It makes it easier for developers and researchers to implement FL algorithms, and to deploy and evaluate the resulting models across different hardware and software environments. FedScale also provides an extensive FL benchmark covering tasks such as image classification, object detection, language modeling, and speech recognition.
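To make the idea of training on decentralized data concrete, here is a minimal sketch of the aggregation step used by federated averaging (FedAvg), a common FL baseline. It is written in plain Python and does not use FedScale's API. The function name, the dictionary-of-lists weight format, and the sample counts are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical weight format: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine client weights, weighting each client by its sample count."""
    if not updates:
        raise ValueError("need at least one client update")
    total_samples = sum(num_samples for _, num_samples in updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    # Start from zeros with the same shape as the first client's weights.
    first_weights, _ = updates[0]
    averaged = {name: [0.0] * len(values) for name, values in first_weights.items()}

    for weights, num_samples in updates:
        share = num_samples / total_samples  # this client's contribution
        for name, values in weights.items():
            for i, value in enumerate(values):
                averaged[name][i] += share * value
    return averaged


# Two illustrative clients: locally trained weights plus local sample counts.
client_updates = [
    ({"w": [1.0, 2.0], "b": [0.0]}, 10),
    ({"w": [3.0, 4.0], "b": [1.0]}, 30),
]
global_weights = federated_average(client_updates)
print(global_weights)
```

Only the weights and sample counts leave each client; the raw training data stays where it was collected.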
Getting Started
Quick Installation on Linux
To get started quickly on a Linux system, run the install.sh script, which automates the installation process. If you're working with CUDA, append --cuda to the installation command.
source install.sh # Add `--cuda` if needed
pip install -e .
Installation from Source on Linux/MacOS
If you prefer a more customized setup or are using MacOS, you can install FedScale from source. This requires Anaconda to be installed:
- Navigate to your FedScale directory.
- Set the FEDSCALE_HOME environment variable and create a handy alias.
- Initialize your Conda environment and activate it.
- Finally, install any additional necessary packages and set up GPU support if required.
cd FedScale
FEDSCALE_HOME=$(pwd)
echo export FEDSCALE_HOME=$(pwd) >> ~/.bashrc
echo alias fedscale=\'bash $FEDSCALE_HOME/fedscale.sh\' >> ~/.bashrc
conda init bash
. ~/.bashrc
conda env create -f environment.yml
conda activate fedscale
pip install -e .
Tutorials
Once the installation is complete, you can dive into FedScale through a series of tutorials:
- Explore FedScale datasets – Learn about the different datasets available within FedScale.
- Deploy your FL experiment – Understand how to deploy a federated learning experiment.
- Implement an FL algorithm – Try implementing a federated learning algorithm using FedScale.
- Deploy FL on smartphones – Discover how to leverage FedScale for deploying FL on mobile devices.
FedScale Datasets
FedScale provides more than 20 large-scale, diverse datasets for federated learning tasks, spanning domains such as computer vision and natural language processing. Each dataset comes with training, validation, and testing subsets, so that models can be developed and assessed on a consistent split. Contributors to these datasets are acknowledged, and users are encouraged to explore and contribute further.
FedScale Runtime
The FedScale Runtime is a robust platform for both deploying and evaluating federated learning models. Building on FedScale's predecessor, Oort, this runtime efficiently scales FL experiments to include thousands of clients per round. Comprehensive documentation aids users in setting up training scripts and deploying models effectively, even on mobile devices.
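The notion of "thousands of clients per round" can be illustrated with a minimal sketch of per-round participant selection. The example below uses plain uniform random sampling as a baseline. It is not FedScale's or Oort's selection logic, and the function name, pool size, and per-round count are hypothetical.

```python
import random
from typing import List


def sample_clients(client_ids: List[int], per_round: int, seed: int) -> List[int]:
    """Pick up to `per_round` distinct clients uniformly at random."""
    rng = random.Random(seed)  # seeded so a given round is reproducible
    count = min(per_round, len(client_ids))
    return rng.sample(client_ids, count)


# Hypothetical pool of 10,000 clients, with 1,000 participants per round.
all_clients = list(range(10_000))
rounds = []
for round_id in range(3):
    participants = sample_clients(all_clients, per_round=1_000, seed=round_id)
    rounds.append(participants)
    print(f"round {round_id}: {len(participants)} participants")
```

A real runtime would then dispatch the current model to these participants and aggregate their updates. Smarter selection policies replace the uniform sampler while keeping the same per-round structure.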
Repository Structure
For those interested in more technical aspects, the FedScale repository is organized into key sections including the core source code, deployment tools, benchmarking datasets, example configurations, and documentation.
References
FedScale is described in papers presented at the International Conference on Machine Learning (ICML) and the USENIX Symposium on Operating Systems Design and Implementation (OSDI); see those papers for more details.
Contributions and Communication
FedScale invites contributions from the community. Users can engage by submitting issues or pull requests on GitHub. For communication and support, there is an active Slack channel, or users can reach out via email for any questions or feedback.
FedScale represents an exciting opportunity to advance the field of federated learning, with its extensive resources, supportive community, and focus on scalability and extensibility.