Federated Learning Primer

Whether you're new to federated learning or just need a quick refresher, here's a concise overview of the key concepts.

Federated learning (FL) is a machine learning approach that trains models across multiple decentralized machines holding local data samples, without directly exchanging the raw data. Instead of collecting all data in one place, FL brings the training to where the data resides. This makes it particularly valuable when working with sensitive data or in scenarios where data privacy is paramount.

The FL Workflow

The federated learning process revolves around two main components: Clients and an Aggregator, which work together in an iterative cycle to improve a global machine learning model while maintaining data privacy.


Clients are distributed participants who own and train on private datasets. Each client receives a copy of the global model and trains it using only its local data. After training, the client produces a model update (e.g. neural network weights) that captures what was learned from that data. By sharing only these updates - never the raw data - clients contribute to the shared model while keeping their data local.
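
To make the client step concrete, here is a minimal sketch of local training. It assumes a tiny linear model trained with plain gradient descent instead of a neural network; the function name local_train, the learning rate, and the toy dataset are all hypothetical and are not part of any SyftBox API.

```python
from typing import List, Tuple

def local_train(global_weights: List[float],
                data: List[Tuple[List[float], float]],
                lr: float = 0.1,
                epochs: int = 5) -> Tuple[List[float], int]:
    """Run a few epochs of gradient descent on a linear model using only local data.

    Returns the updated weights and the number of local samples (hypothetical helper).
    """
    w = list(global_weights)  # work on a copy of the received global model
    n = len(data)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            for j, xj in enumerate(x):
                grad[j] += 2 * err * xj / n  # gradient of the mean squared error
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w, n

# The private data is only read inside local_train; only the
# (weights, sample count) pair would be shared with the aggregator.
private_data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
update, num_samples = local_train([0.0, 0.0], private_data)
```

The sample count is returned alongside the weights because a weighted aggregation strategy such as federated averaging needs it.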

The Aggregator orchestrates the entire workflow by (1) coordinating training rounds and (2) maintaining the global model. It distributes the model to selected clients, collects their updates, and uses aggregation strategies (e.g. federated averaging) to combine these insights into an improved global model. The aggregator also monitors training progress and initiates new rounds, ensuring systematic learning across the network while preserving client independence.
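
As an illustration of the aggregation step, the sketch below implements federated averaging in its simplest form: each client's weights are averaged, weighted by how many samples that client trained on. The function name federated_average and the (weights, sample count) input format are assumptions made for this example.

```python
from typing import List, Tuple

def federated_average(updates: List[Tuple[List[float], int]]) -> List[float]:
    """Combine client weights into one vector, weighting each client by its sample count."""
    total = sum(n for _, n in updates)
    if total == 0:
        raise ValueError("no samples reported by any client")
    dim = len(updates[0][0])
    avg = [0.0] * dim
    for weights, n in updates:
        for j in range(dim):
            avg[j] += weights[j] * n / total
    return avg

# Two hypothetical clients: the second holds three times as much data,
# so its weights count three times as much in the result.
client_updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30)]
new_global = federated_average(client_updates)
print(new_global)  # [2.5, 3.5]
```

In a full round, the aggregator would send new_global back to the selected clients as the starting point for the next round.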

SyftBox Implementation Overview

Pre-trained App Workflow

SyftBox will rely on two distinct APIs to implement the federated learning workflow:

  1. The client API (fl_client) handles:
     • Loading private data from the datasite's private folder
     • Training the local neural network model
     • Storing model updates for synchronization with the aggregator
  2. The aggregator API (fl_aggregator) manages:
     • Coordinating multiple training rounds
     • Collecting and aggregating model updates from all clients
     • Evaluating the global model
     • Distributing the improved model for the next round
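
The division of responsibilities above can be sketched as a file-based handoff: the client stores its update where the aggregator can pick it up, and the aggregator collects and combines whatever it finds. This is only a rough illustration under stated assumptions. The directory layout, the JSON file format, and the function names below are hypothetical and do not reflect the actual fl_client or fl_aggregator code or SyftBox's folder structure.

```python
import json
import tempfile
from pathlib import Path
from typing import List

def client_store_update(shared_dir: Path, client_id: str, round_num: int,
                        weights: List[float], num_samples: int) -> Path:
    """Client side: write the model update to a file the aggregator can read."""
    path = shared_dir / f"round_{round_num}" / f"{client_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"weights": weights, "num_samples": num_samples}))
    return path

def aggregator_collect(shared_dir: Path, round_num: int) -> List[dict]:
    """Aggregator side: read every update stored for the given round."""
    round_dir = shared_dir / f"round_{round_num}"
    return [json.loads(p.read_text()) for p in sorted(round_dir.glob("*.json"))]

def aggregate(updates: List[dict]) -> List[float]:
    """Sample-weighted average of the collected weights."""
    total = sum(u["num_samples"] for u in updates)
    dim = len(updates[0]["weights"])
    return [sum(u["weights"][j] * u["num_samples"] for u in updates) / total
            for j in range(dim)]

# A temporary directory stands in for the synchronized folder (an assumption).
with tempfile.TemporaryDirectory() as tmp:
    shared = Path(tmp)
    client_store_update(shared, "alice", 1, [1.0, 2.0], 10)
    client_store_update(shared, "bob", 1, [3.0, 4.0], 30)
    collected = aggregator_collect(shared, 1)
    global_weights = aggregate(collected)
```

Only the update files cross the boundary between the two sides; the private training data is never written to the shared location.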

Let's now learn how these two APIs work!