Implement Data Preparation logic for manual steps #486

Merged: 247 commits into mlcommons:main on Mar 15, 2024

@aristizabal95 (Contributor) commented Sep 12, 2023

This PR adds the logic needed to handle Data Preparation workflows that require manual steps partway through. Datasets that are not yet fully prepared are stored in a staging folder, and the user can resume preparation by reusing the same dataset name. Each dataset now also creates a report file containing information about the execution, and MedPerf automatically generates a summary so the user can decide what to do when errors occur or manual steps are pending.
Closes #506
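
For readers unfamiliar with the pattern, the staging-and-resume flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `prepare` and `summarize` functions, the `staging/<name>` layout, the `report.json` file name, and the `"done"` / `"manual"` / `"error"` status strings are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def prepare(name, steps, root):
    """Run `steps` in order for dataset `name`, resuming from a staged report.

    Hypothetical sketch: each step is a (step_name, callable) pair, and the
    callable returns "done", "manual", or "error".
    """
    # Partially prepared datasets live in a staging folder keyed by name,
    # so calling prepare() again with the same name picks up where it stopped.
    staging = Path(root) / "staging" / name
    staging.mkdir(parents=True, exist_ok=True)
    report_path = staging / "report.json"
    report = json.loads(report_path.read_text()) if report_path.exists() else {}

    for step_name, step in steps:
        if report.get(step_name) == "done":
            continue  # finished in an earlier run, skip it
        status = step(staging)
        report[step_name] = status
        # Persist the report after every step so an interruption loses nothing.
        report_path.write_text(json.dumps(report, indent=2))
        if status != "done":
            break  # stop and wait for the user (manual step or error)
    return report


def summarize(report, steps):
    """Return a one-line summary of where preparation stands."""
    pending = [step_name for step_name, _ in steps if report.get(step_name) != "done"]
    if not pending:
        return "prepared"
    first = pending[0]
    return f"{report.get(first, 'pending')} at step '{first}'"
```

In this sketch, a step that needs human input returns `"manual"`, the run stops, and the summary names the step that is waiting. Once the user has done the manual work, calling `prepare` again with the same dataset name skips the completed steps and continues from the one that was pending.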

polish multithreading in dataset prepare
@hasan7n temporarily deployed to testing-external-code with GitHub Actions on March 15, 2024 04:18 (inactive)
@hasan7n merged commit 869f378 into mlcommons:main on Mar 15, 2024
8 checks passed
@github-actions bot locked and limited conversation to collaborators on Mar 15, 2024
Linked issue: [FEATURE] Implement Data Preparation logic for manual steps
2 participants