Integrate with syft #28

Closed
5 tasks done
TTitcombe opened this issue Jun 19, 2020 · 2 comments
Labels
Priority: 2 - High 😰 Should be fixed as quickly as possible, ideally within the current or following sprint Type: Epic 🤙 Describes a large amount of functionality that will likely be broken down into smaller issues
Comments

@TTitcombe
Member

TTitcombe commented Jun 19, 2020

What?

The current implementation of data loaders and datasets is quite hacky.
We should integrate existing syft functionality and extend it to make the vertically partitioned dataset
a robust class, so that anyone can easily apply PyVertical to any dataset.

Breakdown

  • Build on syft.fl.BaseDataset to create a dataset which holds partitions and may hold either data or targets. This should extend PyVertical's VerticalDataset to include syft's ownership functionality. (Extend syft federated datasets #47)
  • Create a function dataset_partition which partitions a dataset, sends the partitioned datasets to the correct workers, and returns a syft.fl.FederatedDataset of partitioned datasets. This builds on the current partition_dataset function in PyVertical, and is similar to syft.fl.dataset_federate. (Create partition function for federated datasets #48)
  • Replace PartitionDistributingDataLoader with a dataloader which takes a syft.fl.FederatedDataset. This should extend syft.fl.FederatedDataLoader to account for datasets which may not contain data or targets. (Create syft-like federated dataloader #49)
  • Integrate with PSI. (Integrate PSI with workers #50)
  • Encrypt unique IDs. (Encrypt IDs #54)
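
To make the first three items more concrete, here is a rough, library-free sketch of the idea: a partition that may hold only data or only targets, a function that splits one dataset between two owners, and a batch iterator that tolerates the missing half. It deliberately does not call syft, and every name in it (VerticalPartition, partition_dataset_sketch, iterate_batches, the "alice" and "bob" owners) is hypothetical and only for illustration. It is not the PyVertical or syft API, and sending partitions to remote workers is left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional, Sequence, Tuple


@dataclass
class VerticalPartition:
    """Hypothetical stand-in for one owner's slice of a dataset.

    Either `data` or `targets` may be None, since a vertical partition
    may hold only features or only labels.
    """
    owner: str
    ids: List[str]
    data: Optional[List[Sequence[float]]] = None
    targets: Optional[List[int]] = None

    def __len__(self) -> int:
        return len(self.ids)


def partition_dataset_sketch(
    ids: Sequence[str],
    data: Sequence[Sequence[float]],
    targets: Sequence[int],
    data_owner: str,
    target_owner: str,
) -> Dict[str, VerticalPartition]:
    """Split one dataset into a data-only and a targets-only partition.

    Both partitions keep the same ID list so rows can be re-aligned later.
    """
    if not (len(ids) == len(data) == len(targets)):
        raise ValueError("ids, data and targets must have the same length")
    if data_owner == target_owner:
        raise ValueError("data and targets must go to different owners")
    return {
        data_owner: VerticalPartition(data_owner, list(ids), data=list(data)),
        target_owner: VerticalPartition(target_owner, list(ids), targets=list(targets)),
    }


Batch = Dict[str, Tuple[Optional[list], Optional[list]]]


def iterate_batches(
    partitions: Dict[str, VerticalPartition], batch_size: int
) -> Iterator[Batch]:
    """Yield batches as {owner: (data_or_None, targets_or_None)}."""
    n = min(len(p) for p in partitions.values())
    for start in range(0, n, batch_size):
        stop = start + batch_size
        yield {
            owner: (
                p.data[start:stop] if p.data is not None else None,
                p.targets[start:stop] if p.targets is not None else None,
            )
            for owner, p in partitions.items()
        }


# Example usage with made-up values.
parts = partition_dataset_sketch(
    ids=["a", "b", "c"],
    data=[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    targets=[0, 1, 0],
    data_owner="alice",
    target_owner="bob",
)
batches = list(iterate_batches(parts, batch_size=2))
```

In a real integration the dictionary of partitions would presumably be replaced by a federated dataset whose partitions live on remote workers, and the shared ID list is where PSI and ID encryption (#50, #54) would come in.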

Additional Context

This will be developed simultaneously with the extended PyVertical demonstration (#25). To avoid breaking changes, existing dataloaders/data splitters should be kept until this issue is complete.

@TTitcombe TTitcombe added Priority: 3 - Medium 😒 Should be fixed soon, but there may be other pressing matters that come first Type: Refactor 🔨 A complete overhaul of a file, feature, or codebase labels Jun 19, 2020
@TTitcombe
Member Author

cc @tudorcebere from PySyft

@TTitcombe TTitcombe added Priority: 2 - High 😰 Should be fixed as quickly as possible, ideally within the current or following sprint and removed Priority: 3 - Medium 😒 Should be fixed soon, but there may be other pressing matters that come first labels Jul 4, 2020
@TTitcombe TTitcombe changed the title Make partitioned datasets more syft-like Integrate with syft Jul 11, 2020
@TTitcombe TTitcombe added Type: Epic 🤙 Describes a large amount of functionality that will likely be broken down into smaller issues and removed Type: Refactor 🔨 A complete overhaul of a file, feature, or codebase labels Jul 11, 2020
@TTitcombe TTitcombe removed this from the Extended example milestone Jul 11, 2020
@TTitcombe TTitcombe pinned this issue Jul 15, 2020
@TTitcombe TTitcombe added this to the Syft 3.2 milestone Aug 5, 2020
@TTitcombe
Member Author

Closing, as these issues do not align with the current roadmap for syft 0.3.0.

@TTitcombe TTitcombe unpinned this issue Nov 24, 2020