set up resuming for failed runs #57

Open
iejMac opened this issue Oct 15, 2022 · 4 comments

Comments

@iejMac
Owner

iejMac commented Oct 15, 2022

Something like what rom suggested: generate the input shards first, then for each input shard just check whether its output shard already exists and skip it if it does.
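A minimal sketch of that check, not the project's actual code. The function name `pending_shards`, the zero-padded `00042.tar` naming, and the `.tar` suffix are all assumptions made for illustration.

```python
from pathlib import Path


def pending_shards(input_shard_ids, output_dir, suffix=".tar"):
    """Return the input shard ids that have no output shard yet.

    Assumes (hypothetically) that the output for input shard N is
    written to <output_dir>/<N zero-padded to 5 digits><suffix>.
    """
    output_dir = Path(output_dir)
    return [
        shard_id
        for shard_id in input_shard_ids
        if not (output_dir / f"{shard_id:05d}{suffix}").exists()
    ]
```

On its own this only shows that an output file exists, not that it was written completely, so a run that died mid-shard would still need some extra completion signal.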

@iejMac
Owner Author

iejMac commented Oct 15, 2022

  • plus some additional JSON file that records whether the shard is complete or not

@iejMac
Owner Author

iejMac commented Nov 20, 2022

Make it work like in video2dataset, using input_sharder and all.

@iejMac
Owner Author

iejMac commented Nov 20, 2022

Or we can hack it for now: once a shard has been fully written, write a key_stat.json file as a signal that the shard is done.
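Roughly what that hack could look like, as a sketch under assumptions: the helper names `mark_shard_done` and `is_shard_done`, the per-shard file name `<shard_id>_key_stat.json`, and the contents of `stats` are all hypothetical. The marker goes through a temporary file and a rename so that a crash while writing the marker does not leave a half-written .json behind.

```python
import json
import os
from pathlib import Path


def mark_shard_done(output_dir, shard_id, stats):
    """Write the marker file for a shard; call only after the shard is fully written."""
    # Hypothetical naming scheme: 00042_key_stat.json for shard 42.
    marker = Path(output_dir) / f"{shard_id:05d}_key_stat.json"
    tmp = marker.with_name(marker.name + ".tmp")
    tmp.write_text(json.dumps(stats))
    # Rename into place so the .json file appears in a single step.
    os.replace(tmp, marker)
    return marker


def is_shard_done(output_dir, shard_id):
    """A shard counts as done only if its marker file exists."""
    return (Path(output_dir) / f"{shard_id:05d}_key_stat.json").exists()
```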

@iejMac
Owner Author

iejMac commented Nov 20, 2022

Based on the .json files found in the dataset folder, update this line:

starting_shard_id += math.ceil(work_size / shard_sample_count) * global_rank
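One possible way to do that update, as a sketch only. It assumes `work_size` is the number of samples handled by one rank (which is what the formula above suggests), that finished shards leave a marker named `<shard_id>_key_stat.json` in the dataset folder, and that each rank writes its shards in increasing order. The function name `resume_starting_shard_id` is hypothetical.

```python
import math
from pathlib import Path


def resume_starting_shard_id(dataset_dir, starting_shard_id, work_size,
                             shard_sample_count, global_rank):
    """Return the first shard id in this rank's range that has no marker file."""
    shards_per_rank = math.ceil(work_size / shard_sample_count)
    # Same offset as the original line.
    first = starting_shard_id + shards_per_rank * global_rank

    # Collect ids of finished shards from the marker files (assumed naming).
    done = set()
    for path in Path(dataset_dir).glob("*_key_stat.json"):
        prefix = path.name.split("_", 1)[0]
        if prefix.isdigit():
            done.add(int(prefix))

    # Advance past the contiguous run of finished shards in this rank's range.
    shard_id = first
    while shard_id < first + shards_per_rank and shard_id in done:
        shard_id += 1
    return shard_id
```

If the returned id equals `first + shards_per_rank`, this rank has nothing left to do. The input samples belonging to the skipped shards would need to be skipped as well, i.e. `(shard_id - first) * shard_sample_count` samples under the same assumptions.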
