SyftBox Computation Model
By now, you should have already installed SyftBox as covered in the introduction. If not, check out the SyftBox intro.
Once installed, you'll find a SyftBox folder on your system. Inside this folder, the two main components are APIs and Datasites.
APIs allow you to propose and participate in computations within SyftBox. Datasites, on the other hand, represent individual entities in the Syft network (individuals, organizations, etc.) and contain their private and public data.
Now, let's dive into how you can install an API on your SyftBox and start being part of a computation.
Workflow overview
Applying computations on private data works through a workflow that involves two parties:
- Data Owners -> the party that owns part of the private data taking part in the computation
- API Developer -> the party that wants to apply the computation on the private data; we can also see them as research proposers
Since the API developer is the one proposing the study, they will be in charge of writing both the API designed to run on the data owner's Datasite (preparing the data for aggregation) and the API designed to run on their Datasite.
Each party will manage their own Datasite and SyftBox APIs. For the purpose of these tutorials we'll label data owner Datasites with A, B, etc. and developer Datasites with X, Y, etc.
Data owner setup
A data owner takes part in a computation by installing a SyftBox API following a developer's proposal. The Data Owner API is designed to apply a computation on the private data on their Datasite and write a publicly available result.
A Data Owner is always in control of their private data and what code runs on it.
- Datasite A - a data owner's Datasite
- data - the private data stored on the data owner's Datasite
- data_owner_api - the API running on the data owner's Datasite
- public result - a result computed from the private data, which can be used for aggregations on other Datasites
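To make the data owner's side concrete, here is a minimal sketch of what a `data_owner_api` could look like. The paths and field names are hypothetical (a real SyftBox API resolves its Datasite locations from the client configuration); the point is the pattern: read private data, compute a summary, and write only that summary to a public location.

```python
import json
from pathlib import Path

# Hypothetical paths -- a real SyftBox API resolves these from the client config.
DATASITE = Path("SyftBox/datasites/A")
PRIVATE_DATA = DATASITE / "private" / "data.json"
PUBLIC_RESULT = DATASITE / "public" / "result.json"

def run() -> dict:
    """Compute a summary over the private data and publish only the result."""
    records = json.loads(PRIVATE_DATA.read_text())
    # Only the aggregate leaves the private folder, never the raw records.
    result = {"count": len(records), "total": sum(r["value"] for r in records)}
    PUBLIC_RESULT.parent.mkdir(parents=True, exist_ok=True)
    PUBLIC_RESULT.write_text(json.dumps(result))
    return result
```

The raw records never leave the Datasite; only the derived result lands in the public folder that SyftBox syncs to other participants.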
Developer setup
A developer installs another API on their own Datasite, which aggregates the public results computed on the data owners' Datasites.
- Datasite A & Datasite B - Datasites belonging to data owners taking part in the computation
- aggregator_api - the API running on the proposing developer's Datasite
- final result - the final result of the aggregation
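The aggregator side can be sketched the same way. Again, the folder layout and result schema below are assumptions for illustration: the aggregator walks the public result files that SyftBox has synced from each data owner's Datasite and combines them into a final result.

```python
import json
from pathlib import Path

# Hypothetical layout: each data owner's Datasite exposes a public result file
# that SyftBox syncs to the developer's machine.
DATASITES = Path("SyftBox/datasites")

def aggregate(peers: list[str]) -> dict:
    """Combine the public results synced from each data owner's Datasite."""
    totals = []
    for peer in peers:
        result_file = DATASITES / peer / "public" / "result.json"
        if result_file.exists():  # a peer may not have synced yet
            totals.append(json.loads(result_file.read_text())["total"])
    final = {"participants": len(totals), "sum": sum(totals)}
    out = DATASITES / "X" / "public" / "final_result.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(final))
    return final
```

Skipping peers whose result file is not present yet is a deliberate choice here: sync is asynchronous, so the aggregator should tolerate partial participation rather than fail.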
Step by step
The whole workflow looks something like this:
- data owners (A and B) each prepare their private data (CSV, JSON, or any other format supported by the APIs)
- data owners (A and B) each install the data_owner_api API (developed by the API developer) on their Datasites
- the developer (X) installs the aggregator_api API on their Datasite
- the developer (X) will soon see the aggregation result on their Datasite
That's it! SyftBox takes care of syncing the intermediary public results between datasites so the APIs can do their job.
For this workflow to work, every party needs to have their client running and connected to the same Syft network.
Example: CPU Tracker API
The CPU Tracker API is a simple example of what could be built on SyftBox: it's an application that gathers CPU usage data from participating Datasites, aggregates it, and displays it in a chart.
Click here to see a live example of the CPU Tracker API running on the main Syft network.
Installation
To install the CPU Tracker, follow these steps:
- Make sure your SyftBox client is running
- Click on the "Install CPU_tracker_member" button on the top-left of the page
Another way to install the API is to clone this repo and move the cpu_tracker_member folder to your syftbox/apis folder.
That's it! You're now part of the computation! Your CPU load will be included in the aggregation, helping to calculate the average CPU load across the Syft network. Your Datasite will appear in the "Active Peers" list participating in the computation.
How does the API work?
After installing the CPU Tracker API, you'll notice a new API called cpu_tracker_member in the APIs folder of your SyftBox. This API is defined by two key files, main.py and run.sh, which work together to perform a computation using data from your Datasite.
SyftBox
├── datasites
│ └── ...
└── apis
├── ...
└── cpu_tracker_member
├── main.py
├── run.sh
└── ...
A quick glance at the main.py script shows that the API collects 50 data points of your CPU usage at specific intervals and averages them (adding noise to ensure a degree of privacy). The processed result is then placed in a public folder to make it available for aggregation.
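The sample-then-add-noise step can be sketched as follows. This is not the API's actual code: the sampler is left pluggable (the real main.py reads CPU load from a system library), and uniform noise is used as a simple stand-in for whatever noise mechanism the real API applies.

```python
import random
import statistics
import time

def average_cpu(sample, n_samples: int = 50, interval: float = 0.0,
                noise_scale: float = 1.0) -> float:
    """Average n_samples CPU readings, then add noise for a degree of privacy.

    `sample` is any zero-argument callable returning the current CPU load;
    the real API reads this from the operating system.
    """
    readings = []
    for _ in range(n_samples):
        readings.append(sample())
        time.sleep(interval)  # space the samples out in time
    mean = statistics.fmean(readings)
    # Uniform noise in [-noise_scale, noise_scale] as a simple stand-in.
    return mean + random.uniform(-noise_scale, noise_scale)
```

Averaging many samples and perturbing the result means no single instantaneous reading of your machine is ever published exactly.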
The cpu_tracker API also creates a file on your Datasite, located in the api_data folder, which contains your average CPU usage data for a specific time frame.
SyftBox
├── apis
│ └── cpu_tracker_member
└── datasites
└── YOUR_DATASITE
└── api_data
└── cpu_tracker
└── cpu_tracker.json
The cpu_tracker_member API automatically manages the loading and processing of your data so it can be included in the global aggregation.
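Writing that api_data file could look something like the sketch below. The path and JSON fields are assumptions based on the folder layout shown above; the real API resolves your Datasite root from the client configuration.

```python
import json
import time
from pathlib import Path

# Hypothetical path mirroring the layout above; the real API derives it
# from the SyftBox client config.
API_DATA = Path("SyftBox/datasites/YOUR_DATASITE/api_data/cpu_tracker")

def record_average(avg_cpu: float):
    """Persist the averaged (noised) CPU reading for this time window."""
    API_DATA.mkdir(parents=True, exist_ok=True)
    out = API_DATA / "cpu_tracker.json"
    out.write_text(json.dumps({"cpu_avg": avg_cpu, "timestamp": time.time()}))
    return out
```

Because the file lives under your Datasite, SyftBox handles syncing it; the API itself only needs to keep the file up to date.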
In the next tutorial, we'll dive deeper and learn how to build an API like this from scratch!