How to: Add your Data to the Network
This guide walks you through the process of adding your data to the SyftBox network with different access levels using the Syft Permissions system.
Before You Begin
- Ensure you have a SyftBox account and have logged in
- Understand the basics of the file structure in your datasite
- Familiarize yourself with the four permission types: read, create, write, and admin
Understanding Data Access in SyftBox
SyftBox allows you to combine permissions flexibly to create any desired level of access for your data. A few notable configurations include:
- Private Data - Only you can see or modify your data
- Public Data - Anyone can view your data (but only you can modify it)
- Semi-Public Data - Only specific people can view your data
- Collaborative Data - Specific users can both view and modify your data
You can mix and match these permission types to create custom access patterns for your specific needs.
How to Add Data with Specific Permissions
Important: Files in SyftBox inherit permissions from their parent folder. This means the simplest way to add data with specific permissions is to:
- Create a folder with your desired permission structure (using a syftperm.yaml file)
- Add your data files to that folder
- The files will automatically adopt the permissions of the folder
This inheritance model means you don't need to set permissions for individual files. Instead, organize your folder structure based on who needs access to which data. To change your data's permissions, move it to a folder with different permission settings.
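To see which folder governs a given file, it can help to walk up the tree and find the closest syftperm.yaml. The sketch below does that with the Python standard library. It assumes the nearest ancestor folder with a syftperm.yaml is the one that matters; SyftBox's own resolution logic may combine rules from several levels, and the function name governing_perm_file is hypothetical.

```python
from pathlib import Path
from typing import Optional

PERM_FILE = "syftperm.yaml"  # file name taken from this guide


def governing_perm_file(data_file: Path, datasite_root: Path) -> Optional[Path]:
    """Return the closest syftperm.yaml at or above data_file's folder.

    Assumption: the nearest ancestor folder holding a syftperm.yaml decides
    access. The real SyftBox logic may merge rules from several levels.
    """
    root = datasite_root.resolve()
    folder = data_file.resolve().parent
    # Walk upwards, but never leave the datasite root.
    while folder == root or root in folder.parents:
        candidate = folder / PERM_FILE
        if candidate.is_file():
            return candidate
        if folder == root:
            break
        folder = folder.parent
    return None
```

If the function returns None, no folder between the file and the datasite root carries a syftperm.yaml, which is a hint that the file was placed outside the folder structure you intended.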
Creating a Private Data Folder
When you want to keep your data completely private, create a private directory where only you have access.
Steps:
- Create a new folder in your SyftBox (e.g., "private-research")
- Add a syftperm.yaml file inside the folder with these contents:
    - path: '**'
      permissions:
      - admin
      - read
      - write
      user: your-email@example.org
- Add your data files to this folder
- Your data will only be accessible to you
This setup gives you complete control over your sensitive data while preventing access by others.
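The steps above can also be scripted. The following is a minimal sketch using only the Python standard library; the function name create_private_folder and the parent path you pass in are illustrative assumptions, not part of SyftBox.

```python
from pathlib import Path

# Owner-only rule from the private folder example; {owner} is filled in below.
PRIVATE_RULES = """\
- path: '**'
  permissions:
  - admin
  - read
  - write
  user: {owner}
"""


def create_private_folder(parent: Path, name: str, owner: str) -> Path:
    """Create parent/name and write an owner-only syftperm.yaml into it."""
    folder = parent / name
    folder.mkdir(parents=True, exist_ok=True)
    perm_file = folder / "syftperm.yaml"
    perm_file.write_text(PRIVATE_RULES.format(owner=owner), encoding="utf-8")
    return folder
```

Call it with the location of your own datasite as parent, for example create_private_folder(my_datasite, "private-research", "your-email@example.org"), where my_datasite is a Path you supply. Writing the permission file before copying any data in means the files are never present without the rule.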
Sharing Data Publicly
To share your data with everyone in the network while maintaining control over modifications:
Steps:
- Create a new folder (e.g., "public-dataset")
- Add a syftperm.yaml file with:
    - path: '**'
      permissions:
      - admin
      - read
      - write
      user: your-email@example.org
    - path: '**'
      permissions:
      - read
      user: '*'
- Add your data files to this folder
- Anyone on the network can now read your data, but only you can modify it
This is perfect for datasets or resources you want to share broadly across the network.
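To reason about what the two rules above grant, you can model them as plain Python data and compute the permissions a given user ends up with. This is only a sketch: it assumes that all rules naming a user (or the '*' wildcard) add up, and it ignores the path field because every rule here uses '**'. The helper permissions_for is hypothetical and is not a SyftBox API.

```python
from typing import Any, Dict, List, Set

Rule = Dict[str, Any]

# The two rules from the public-dataset example, as Python data.
PUBLIC_RULES: List[Rule] = [
    {"path": "**", "permissions": ["admin", "read", "write"],
     "user": "your-email@example.org"},
    {"path": "**", "permissions": ["read"], "user": "*"},
]


def permissions_for(rules: List[Rule], user: str) -> Set[str]:
    """Union of permissions from every rule naming `user` or the '*' wildcard.

    Assumption: matching rules are additive. The `path` field is ignored,
    since it is '**' (everything in the folder) in all rules used here.
    """
    granted: Set[str] = set()
    for rule in rules:
        if rule["user"] in (user, "*"):
            granted.update(rule["permissions"])
    return granted
```

Under these assumptions, permissions_for(PUBLIC_RULES, "your-email@example.org") yields admin, read, and write, while any other address yields read only, which matches the intent of the public configuration.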
Setting Up Semi-Public Data Access
When you want to share data with specific users but keep it private from others:
Steps:
- Create a new folder (e.g., "team-resources")
- Add a syftperm.yaml file with:
    - path: '**'
      permissions:
      - admin
      - read
      - write
      user: your-email@example.org
    - path: '**'
      permissions:
      - read
      user: colleague1@example.org
    - path: '**'
      permissions:
      - read
      user: colleague2@example.org
- Add your data files to this folder
- Only you and the specified colleagues can view this data
This approach works well for targeted data sharing with specific team members or trusted partners.
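When the list of colleagues grows, writing one rule per person by hand becomes error-prone. The sketch below generates the same file contents from a list of email addresses. The function names are hypothetical, and the output is built with plain string formatting rather than a YAML library, so it assumes the values need no quoting (true for ordinary email addresses, but not for the '*' wildcard).

```python
from typing import Iterable, List


def rule_block(user: str, permissions: Iterable[str]) -> str:
    """Render one syftperm.yaml rule covering everything in the folder."""
    lines = ["- path: '**'", "  permissions:"]
    lines += [f"  - {p}" for p in permissions]
    # No quoting is applied, so pass plain email addresses only.
    lines.append(f"  user: {user}")
    return "\n".join(lines)


def semi_public_rules(owner: str, readers: List[str]) -> str:
    """Owner keeps admin/read/write; each listed reader gets read only."""
    blocks = [rule_block(owner, ["admin", "read", "write"])]
    blocks += [rule_block(reader, ["read"]) for reader in readers]
    return "\n".join(blocks) + "\n"
```

Passing ["colleague1@example.org", "colleague2@example.org"] as readers reproduces the file shown in the steps above. Save the returned text as syftperm.yaml in the shared folder.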
Creating a Collaborative Data Space
To enable multiple people to contribute to and modify a shared dataset:
Steps:
- Create a new folder (e.g., "team-project")
- Add a syftperm.yaml file with:
    - path: '**'
      permissions:
      - admin
      - read
      - write
      user: your-email@example.org
    - path: '**'
      permissions:
      - read
      - write
      user: collaborator1@example.org
    - path: '**'
      permissions:
      - read
      - write
      user: collaborator2@example.org
- Add initial data files to the folder
- Now both you and your collaborators can add and modify data in this space
This setup is ideal for team projects where multiple people need to contribute data and work together on analysis.
Advanced Data Organization Strategies
Mixed Permission Levels
You can create a data structure with different permission levels for different categories:
data-project/
├── syftperm.yaml (base permissions)
├── public/
│   └── syftperm.yaml (public read permissions)
├── team/
│   └── syftperm.yaml (team member permissions)
└── private/
    └── syftperm.yaml (private permissions)
User-Specific Data Folders
You can create a structure where each user has their own data space:
    - path: '**'
      permissions:
      - admin
      - read
      - write
      user: your-email@example.org
    - path: '{useremail}/**'
      permissions:
      - read
      - write
      user: '*'
This allows each user to manage their own data within their designated folder.
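One way to picture the second rule is as a template: {useremail} is replaced by the address of the user making the request, and the resulting pattern is matched against the path. The sketch below illustrates that reading with Python's fnmatch module. It is an approximation for intuition only; the substitution step and both function names are assumptions, and SyftBox's own matcher may behave differently.

```python
from fnmatch import fnmatchcase

USER_PATTERN = "{useremail}/**"  # path pattern from the rule above


def pattern_for(user_email: str) -> str:
    """Substitute the requesting user's email into the path pattern."""
    return USER_PATTERN.replace("{useremail}", user_email)


def rule_covers(user_email: str, relative_path: str) -> bool:
    """True if relative_path lies inside user_email's own subfolder.

    fnmatch lets '*' match across '/', so '**' here simply means
    "anything below the user's folder".
    """
    return fnmatchcase(relative_path, pattern_for(user_email))
```

With this reading, alice@example.org is covered for alice@example.org/notes/data.csv but not for bob@example.org/data.csv, so each user's read and write access stays within the folder named after their own address.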
Making Your Data Discoverable on the Network
Even when your actual data is private, you can make it discoverable by others on the SyftBox network using a dataset.yaml file.
Creating a Dataset Descriptor
- Create a dataset.yaml file with metadata about your dataset
- Place it in a public folder (e.g., SyftBox/public/)
Example dataset.yaml:
    version: "0.1.0"
    datasets:
      - name: "Netflix Data"
        path: "~/SyftBox/datasites/youremail@example.org/datasets/netflix/NetflixViewingHistory"
        dataset_loader: "SyftBox/datasites/aggregator@openmined.org/public/data_loader/json_loader.py"
        description: "Primary dataset for user behavior analysis."
        format: "CSV"
This approach allows others to discover your dataset's existence and description without being able to access the actual data unless you grant them permission.
Troubleshooting
- Permission Issues: If a user can't access data they should be able to, check for conflicting rules in parent directories or the syncing status
- Write Without Read: Remember that write and create permissions require read permission to work properly
- Owner Override: As the datasite owner, you always have full permissions regardless of the rules
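The "Write Without Read" case is easy to check mechanically. The sketch below flags any rule that grants write or create without also granting read. It works on rules represented as Python dictionaries with the same keys as syftperm.yaml. The function name is hypothetical, and the check is per rule, so it will also flag a user who receives read through a separate rule such as a '*' wildcard.

```python
from typing import Any, Dict, List

Rule = Dict[str, Any]


def rules_missing_read(rules: List[Rule]) -> List[Rule]:
    """Return rules that grant write or create but do not also grant read."""
    flagged = []
    for rule in rules:
        perms = set(rule["permissions"])
        if perms & {"write", "create"} and "read" not in perms:
            flagged.append(rule)
    return flagged
```

An empty result means every rule that allows changes also allows reading. Any rule returned is worth reviewing by hand before you rely on it.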
By following this guide, you can add your data to the SyftBox network with various access levels to suit your specific needs for private, public, and collaborative workspaces. For more advanced permission management, check out the Syft Permissions reference.