Federated Learning Primer

Whether you're new to federated learning or just need a quick refresher, here's a concise overview of the key concepts.

Federated learning (FL) is a machine learning approach that trains models across multiple decentralized machines holding local data samples, without directly exchanging the raw data. Instead of collecting all data in one place, FL brings the training to where the data resides. This makes it particularly valuable when working with sensitive data or in scenarios where data privacy is paramount.

The FL Workflow

The federated learning process revolves around two main components: Clients and an Aggregator, which work together in an iterative cycle to improve a global machine learning model while maintaining data privacy.

Clients are distributed participants who own and train on private datasets. Each client receives a copy of the global model and trains it using only its local data. After training, each client produces a model update (e.g., neural network weights) that captures what was learned from its data. By sharing only these updates, and never the raw data, clients contribute to the collective model while keeping their data local.
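To make the client step concrete, here is a minimal sketch of local training. It assumes a toy linear model stored as a plain list of floats instead of a neural network, and the names local_train, lr, and epochs are hypothetical, chosen for illustration only.

```python
from typing import List, Tuple

Weights = List[float]
Sample = Tuple[List[float], float]  # (features, target)


def local_train(global_weights: Weights, data: List[Sample],
                lr: float = 0.1, epochs: int = 5) -> Tuple[Weights, int]:
    """Train a toy linear model on local data, starting from the global weights.

    Returns the updated weights and the number of local samples. Only these
    two values would leave the client; the raw data never does.
    """
    weights = list(global_weights)  # copy, so the received model is not mutated
    for _ in range(epochs):
        for features, target in data:
            prediction = sum(w * x for w, x in zip(weights, features))
            error = prediction - target
            # One stochastic gradient descent step on the squared error.
            weights = [w - lr * error * x for w, x in zip(weights, features)]
    return weights, len(data)


# Example: one client whose private data follows y = 2x.
received = [0.0]
update, num_samples = local_train(received, [([1.0], 2.0)])
```

The sample count is returned alongside the weights because weighted aggregation schemes such as federated averaging typically need it.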

The Aggregator orchestrates the entire workflow by (1) coordinating training rounds and (2) maintaining the global model. It distributes the model to selected clients, collects their updates, and uses aggregation strategies (e.g. federated averaging) to combine these insights into an improved global model. The aggregator also monitors training progress and initiates new rounds, ensuring systematic learning across the network while preserving client independence.
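As an illustration of the aggregation step, the sketch below implements federated averaging in its common form: each client's weights are averaged, weighted by how many samples that client trained on. It assumes updates arrive as (weights, num_samples) pairs of plain Python values; the function name federated_average is hypothetical and not part of any particular library.

```python
from typing import List, Tuple

Update = Tuple[List[float], int]  # (client weights, number of local samples)


def federated_average(updates: List[Update]) -> List[float]:
    """Combine client weights into global weights, weighted by sample count."""
    total_samples = sum(n for _, n in updates)
    if total_samples == 0:
        raise ValueError("need at least one update with a non-zero sample count")
    dim = len(updates[0][0])
    averaged = [0.0] * dim
    for weights, n in updates:
        if len(weights) != dim:
            raise ValueError("all client updates must have the same shape")
        for i, w in enumerate(weights):
            # A client's influence is proportional to its share of the data.
            averaged[i] += w * n / total_samples
    return averaged


# Example: the second client holds three times as much data as the first.
new_global = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
```

Weighting by sample count means a client with more data pulls the global model further toward its update; an unweighted mean is a simpler alternative when client datasets are of similar size.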

SyftBox Implementation Overview

Pre-trained Workflow

The SyftBox implementation relies on two distinct Apps that together form the federated learning workflow:

The client App (fl_client) handles:

Loading private data from the datasite's private folder
Training the local neural network model
Storing model updates for synchronization with the aggregator
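The three client responsibilities above can be sketched as a single file-based step. This is an illustration under stated assumptions, not the fl_client code: the file name data.json, the JSON format, the toy linear model, and the function run_client_step are all hypothetical stand-ins, and no SyftBox API is used.

```python
import json
from pathlib import Path


def run_client_step(private_dir: Path, model_path: Path, update_path: Path,
                    lr: float = 0.1) -> None:
    """Load private data, train locally, and store the update in a file."""
    # 1. Load private data (hypothetical file name and format:
    #    a JSON list of [features, target] pairs).
    samples = json.loads((private_dir / "data.json").read_text())

    # 2. Train a toy linear model, starting from the received global weights.
    weights = json.loads(model_path.read_text())
    for features, target in samples:
        prediction = sum(w * x for w, x in zip(weights, features))
        error = prediction - target
        weights = [w - lr * error * x for w, x in zip(weights, features)]

    # 3. Store only the update, in a location that a sync mechanism
    #    could pick up. The private data itself is never written there.
    update_path.parent.mkdir(parents=True, exist_ok=True)
    update_path.write_text(
        json.dumps({"weights": weights, "num_samples": len(samples)})
    )
```

Keeping the private folder as read-only input and writing the update to a separate path mirrors the separation between data that stays local and results that are shared.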

The aggregator App (fl_aggregator) manages:

Coordinating multiple training rounds
Collecting and aggregating model updates from all clients
Evaluating the global model
Distributing the improved model for the next round
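The aggregator's responsibilities above can be sketched as a round loop. The simulation below runs entirely in memory with a toy linear model; it is not the fl_aggregator code, and every name in it (train_on_client, aggregate, evaluate, run_rounds) is hypothetical.

```python
from typing import Dict, List, Tuple

Weights = List[float]
Dataset = List[Tuple[List[float], float]]  # (features, target) pairs


def train_on_client(weights: Weights, data: Dataset,
                    lr: float = 0.1) -> Tuple[Weights, int]:
    """Stand-in for a client: one pass of gradient descent on local data."""
    weights = list(weights)
    for features, target in data:
        error = sum(w * x for w, x in zip(weights, features)) - target
        weights = [w - lr * error * x for w, x in zip(weights, features)]
    return weights, len(data)


def aggregate(updates: List[Tuple[Weights, int]]) -> Weights:
    """Sample-weighted average of client weights (federated averaging)."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


def evaluate(weights: Weights, data: Dataset) -> float:
    """Mean squared error of the global model on held-out data."""
    errors = [sum(w * x for w, x in zip(weights, f)) - t for f, t in data]
    return sum(e * e for e in errors) / len(errors)


def run_rounds(initial: Weights, clients: Dict[str, Dataset],
               test_data: Dataset, num_rounds: int) -> Tuple[Weights, List[float]]:
    global_weights = list(initial)
    history = []
    for _ in range(num_rounds):                        # coordinate rounds
        updates = [train_on_client(global_weights, d)  # distribute and collect
                   for d in clients.values()]
        global_weights = aggregate(updates)            # aggregate
        history.append(evaluate(global_weights, test_data))  # evaluate
    return global_weights, history


# Example: two clients whose private data both follow y = 2x.
clients = {"a": [([1.0], 2.0)], "b": [([2.0], 4.0), ([1.0], 2.0)]}
final_weights, losses = run_rounds([0.0], clients, [([3.0], 6.0)], num_rounds=20)
```

In this toy setup the global weight moves toward 2.0 and the evaluation loss shrinks from round to round, which is the behavior the iterative cycle is meant to produce.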

Let's now learn how these two Apps work!
