How to: Add your Data to the Network

This guide shows how to add your data to the SyftBox network at different access levels using the Syft Permissions system.

Before You Begin

  • Ensure you have a SyftBox account and have logged in
  • Understand the basics of the file structure in your datasite
  • Familiarize yourself with the four permission types: read, create, write, and admin

Understanding Data Access in SyftBox

SyftBox allows you to combine permissions flexibly to create any desired level of access for your data. A few notable configurations include:

  1. Private Data - Only you can see or modify your data
  2. Public Data - Anyone can view your data (but only you can modify)
  3. Semi-Public Data - Only specific people can view your data
  4. Collaborative Data - Specific users can both view and modify your data

You can mix and match these permission types to create custom access patterns for your specific needs.

How to Add Data with Specific Permissions

Important: Files in SyftBox inherit permissions from their parent folder. This means the simplest way to add data with specific permissions is to:

  1. Create a folder with your desired permission structure (using a syftperm.yaml file)
  2. Add your data files to that folder
  3. The files will automatically adopt the permissions of the folder

This inheritance model means you don't need to set permissions for individual files. Instead, organize your folder structure based on who needs access to which data. To change a file's permissions, move it to a folder with different permission settings.
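The inheritance rule can be pictured as a lookup that walks up from a file until it reaches a folder containing a syftperm.yaml. The sketch below only illustrates that idea, under the assumption that the nearest enclosing syftperm.yaml is the one that applies. `nearest_permission_file` is a hypothetical helper, not part of SyftBox, and SyftBox's actual resolution logic may combine rules differently.

```python
from pathlib import Path
from typing import Optional

PERM_FILE = "syftperm.yaml"


def nearest_permission_file(file_path: Path, datasite_root: Path) -> Optional[Path]:
    """Return the closest syftperm.yaml at or above file_path's folder.

    Hypothetical helper for illustration only. It assumes the nearest
    enclosing permission file is the one that governs the file.
    """
    root = datasite_root.resolve()
    folder = file_path.resolve().parent
    while True:
        candidate = folder / PERM_FILE
        if candidate.is_file():
            return candidate
        # Stop at the datasite root (or the filesystem root as a safety net).
        if folder == root or folder == folder.parent:
            return None
        folder = folder.parent
```

Under this reading, moving a file into another folder changes which syftperm.yaml the lookup finds, which is why moving data is enough to change its permissions.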

Creating a Private Data Folder

When you want to keep your data completely private, create a private directory where only you have access.

Steps:

  1. Create a new folder in your SyftBox (e.g., "private-research")
  2. Add a syftperm.yaml file inside the folder with these contents:
     - path: '**'
       permissions:
         - admin
         - read
         - write
       user: your-email@example.org
  3. Add your data files to this folder
  4. Your data will only be accessible to you

This setup gives you complete control over your sensitive data while preventing access by others.
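The steps above can also be scripted. The following is a minimal sketch using only the Python standard library; `create_private_folder` is a hypothetical helper name, and the datasite path you pass in is an assumption about where your SyftBox folder lives on disk.

```python
from pathlib import Path

# Owner-only rule, matching the private-folder example in this guide.
PRIVATE_RULES = """\
- path: '**'
  permissions:
    - admin
    - read
    - write
  user: {owner}
"""


def create_private_folder(datasite: Path, name: str, owner: str) -> Path:
    """Create datasite/name and write an owner-only syftperm.yaml into it."""
    folder = datasite / name
    folder.mkdir(parents=True, exist_ok=True)
    perm_file = folder / "syftperm.yaml"
    perm_file.write_text(PRIVATE_RULES.format(owner=owner), encoding="utf-8")
    return folder
```

Writing the permission file before copying any data into the folder avoids a window in which the files exist without the intended rules.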

Sharing Data Publicly

To share your data with everyone in the network while maintaining control over modifications:

Steps:

  1. Create a new folder (e.g., "public-dataset")
  2. Add a syftperm.yaml file with:
     - path: '**'
       permissions:
         - admin
         - read
         - write
       user: your-email@example.org

     - path: '**'
       permissions:
         - read
       user: '*'
  3. Add your data files to this folder
  4. Anyone on the network can now read your data, but only you can modify it

This is perfect for datasets or resources you want to share broadly across the network.
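If you prefer to generate the file rather than type it, the rules can be rendered from a small list. This is a sketch only: `render_rules` and the `Rule` tuple are hypothetical names introduced here, and the output simply mirrors the layout of the examples in this guide.

```python
from typing import Iterable, Sequence, Tuple

# (path pattern, permissions, user)
Rule = Tuple[str, Sequence[str], str]


def render_rules(rules: Iterable[Rule]) -> str:
    """Render rules as text in the syftperm.yaml layout used in this guide."""
    blocks = []
    for path, permissions, user in rules:
        # A bare * starts a YAML alias, so the wildcard user must be quoted.
        user_text = "'*'" if user == "*" else user
        lines = [f"- path: '{path}'", "  permissions:"]
        lines += [f"    - {permission}" for permission in permissions]
        lines.append(f"  user: {user_text}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


# Owner keeps full control; everyone else gets read-only access.
public_rules = render_rules([
    ("**", ["admin", "read", "write"], "your-email@example.org"),
    ("**", ["read"], "*"),
])
```

The resulting string can be written to the syftperm.yaml file inside the public folder.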

Setting Up Semi-Public Data Access

When you want to share data with specific users but keep it private from others:

Steps:

  1. Create a new folder (e.g., "team-resources")
  2. Add a syftperm.yaml file with:
     - path: '**'
       permissions:
         - admin
         - read
         - write
       user: your-email@example.org

     - path: '**'
       permissions:
         - read
       user: colleague1@example.org

     - path: '**'
       permissions:
         - read
       user: colleague2@example.org
  3. Add your data files to this folder
  4. Only you and the specified colleagues can view this data

This approach works well for targeted data sharing with specific team members or trusted partners.
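Granting access to one more colleague later means adding another read rule to the same file. The sketch below appends such a rule as text; `add_reader` is a hypothetical helper, and it assumes the file is a top-level YAML list with unindented rules, as in the example above.

```python
from pathlib import Path


def add_reader(perm_file: Path, email: str) -> None:
    """Append a read-only rule for one user to an existing syftperm.yaml.

    Hypothetical helper; assumes the file is a top-level YAML list
    like the examples in this guide.
    """
    rule = (
        "- path: '**'\n"
        "  permissions:\n"
        "    - read\n"
        f"  user: {email}\n"
    )
    existing = perm_file.read_text(encoding="utf-8")
    # Separate the new rule from the previous one with a blank line.
    separator = "\n" if existing.endswith("\n") else "\n\n"
    perm_file.write_text(existing + separator + rule, encoding="utf-8")
```

Removing a colleague is the reverse operation: delete their rule block from the file.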

Creating a Collaborative Data Space

To enable multiple people to contribute to and modify a shared dataset:

Steps:

  1. Create a new folder (e.g., "team-project")
  2. Add a syftperm.yaml file with:
     - path: '**'
       permissions:
         - admin
         - read
         - write
       user: your-email@example.org

     - path: '**'
       permissions:
         - read
         - write
       user: collaborator1@example.org

     - path: '**'
       permissions:
         - read
         - write
       user: collaborator2@example.org
  3. Add initial data files to the folder
  4. Now both you and your collaborators can add and modify data in this space

This setup is ideal for team projects where multiple people need to contribute data and work together on analysis.

Advanced Data Organization Strategies

Mixed Permission Levels

You can create a data structure with different permission levels for different categories:

data-project/
├── syftperm.yaml (base permissions)
├── public/
│   └── syftperm.yaml (public read permissions)
├── team/
│   └── syftperm.yaml (team member permissions)
└── private/
    └── syftperm.yaml (private permissions)
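The layout above can be created in one step. This is a minimal sketch using the standard library; `create_project_skeleton` is a hypothetical helper, and it leaves each syftperm.yaml empty as a placeholder, so fill in the rules for every folder before adding any data.

```python
from pathlib import Path
from typing import List


def create_project_skeleton(datasite: Path, project: str = "data-project") -> List[Path]:
    """Create the project folders, each with a placeholder syftperm.yaml."""
    root = datasite / project
    folders = [root, root / "public", root / "team", root / "private"]
    perm_files = []
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
        perm_file = folder / "syftperm.yaml"
        # Empty placeholder; does not overwrite a file that already exists.
        perm_file.touch(exist_ok=True)
        perm_files.append(perm_file)
    return perm_files
```

The function returns the four permission files in the order base, public, team, private, which makes it easy to write the matching rules into each one afterwards.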

User-Specific Data Folders

You can create a structure where each user has their own data space:

- path: '**'
  permissions:
    - admin
    - read
    - write
  user: your-email@example.org

- path: '{useremail}/**'
  permissions:
    - read
    - write
  user: '*'

This allows each user to manage their own data within their designated folder.
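One way to read the `{useremail}/**` pattern is that the placeholder is replaced by the email address of whichever user is being checked, so the rule covers only the folder named after that user. The sketch below assumes that substitution behavior. Both function names are hypothetical, and Python's `fnmatchcase` is used as a rough stand-in for SyftBox's own pattern matching, which may differ in detail.

```python
from fnmatch import fnmatchcase


def expand_user_pattern(pattern: str, user_email: str) -> str:
    """Substitute the {useremail} placeholder with a concrete address."""
    return pattern.replace("{useremail}", user_email)


def pattern_covers(pattern: str, user_email: str, relative_path: str) -> bool:
    """Check whether the expanded pattern matches a path inside the folder.

    Illustration only: fnmatchcase treats '*' as matching any characters,
    including '/', which approximates the '**' wildcard here.
    """
    return fnmatchcase(relative_path, expand_user_pattern(pattern, user_email))
```

For example, `pattern_covers("{useremail}/**", "alice@example.org", "alice@example.org/results/a.csv")` is true, while the same call with a path under `bob@example.org/` is false.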

Making Your Data Discoverable on the Network

Even when your actual data is private, you can make it discoverable by others on the SyftBox network using a dataset.yaml file.

Creating a Dataset Descriptor

  1. Create a dataset.yaml file with metadata about your dataset
  2. Place it in a public folder (e.g., SyftBox/public/)

Example dataset.yaml:

version: "0.1.0"

datasets:
  - name: "Netflix Data"
    path: "~/SyftBox/datasites/youremail@example.org/datasets/netflix/NetflixViewingHistory"
    dataset_loader: "SyftBox/datasites/aggregator@openmined.org/public/data_loader/json_loader.py"
    description: "Primary dataset for user behavior analysis."
    format: "CSV"

This approach allows others to discover your dataset's existence and description without being able to access the actual data unless you grant them permission.
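A descriptor with these fields can be generated from Python. This sketch reproduces only the keys shown in the example above; `render_dataset_yaml` is a hypothetical helper, and whether SyftBox accepts or requires other keys is not covered here.

```python
import json


def render_dataset_yaml(
    name: str,
    path: str,
    dataset_loader: str,
    description: str,
    data_format: str,
    version: str = "0.1.0",
) -> str:
    """Render a single-dataset dataset.yaml with the fields used in this guide."""
    # JSON string quoting is also valid YAML double-quoted syntax.
    q = json.dumps
    return (
        f"version: {q(version)}\n"
        "\n"
        "datasets:\n"
        f"  - name: {q(name)}\n"
        f"    path: {q(path)}\n"
        f"    dataset_loader: {q(dataset_loader)}\n"
        f"    description: {q(description)}\n"
        f"    format: {q(data_format)}\n"
    )
```

Write the returned text to a dataset.yaml file in your public folder so that others can read the description while the data itself stays in a restricted folder.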

Troubleshooting

  • Permission Issues: If a user can't access data they should be able to, check for conflicting rules in parent directories and confirm that the folder has finished syncing
  • Write Without Read: Write and create permissions require read permission to work properly, so grant read alongside them
  • Owner Override: As the datasite owner, you always have full permissions regardless of the rules
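The "Write Without Read" problem can be checked mechanically once the rules are available as Python dictionaries. The sketch below is a simple illustration: `rules_missing_read` is a hypothetical helper, the dictionaries mirror the syftperm.yaml keys used in this guide, and each rule is examined in isolation, so it does not account for read access granted to the same user by another rule.

```python
from typing import Dict, List


def rules_missing_read(rules: List[Dict]) -> List[Dict]:
    """Return rules that grant write or create but do not grant read."""
    flagged = []
    for rule in rules:
        permissions = set(rule.get("permissions", []))
        if permissions & {"write", "create"} and "read" not in permissions:
            flagged.append(rule)
    return flagged


rules = [
    {"path": "**", "permissions": ["admin", "read", "write"], "user": "your-email@example.org"},
    {"path": "**", "permissions": ["write"], "user": "collaborator1@example.org"},
    {"path": "**", "permissions": ["read"], "user": "*"},
]
problems = rules_missing_read(rules)
```

Here `problems` contains only the rule for collaborator1@example.org, which grants write without read.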

By following this guide, you can add your data to the SyftBox network with various access levels to suit your specific needs for private, public, and collaborative workspaces. For more advanced permission management, check out the Syft Permissions reference.