Use env for repro tests, improve repro names #3430
50f4bb2 to 6acb6d1
What's the issue with the 4 GPU job?
Not sure yet, it still needs fixes.
6acb6d1 to dad1b1b
I incremented the reference counter because I think it's far easier to "reset" our comparable references than to support comparing data exported from HDF5 files (which we now use to support reproducibility tests for GPU jobs) with NetCDF (NC) files (which we previously used).
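To illustrate why bumping the counter amounts to a "reset", here is a minimal Python sketch. It assumes that each stored reference is tagged with the counter value it was produced under and that only references with the current counter value are compared; the function name, run names, and counter values are all hypothetical and not taken from this PR.

```python
def comparable_references(stored_runs, current_counter):
    """Return the names of stored runs whose reference counter matches.

    `stored_runs` maps a (hypothetical) run name to the counter value it
    was generated with. Runs from older counters are simply ignored, so no
    cross-format comparison (e.g. NetCDF vs. HDF5) is ever attempted.
    """
    return [name for name, counter in stored_runs.items()
            if counter == current_counter]


# Hypothetical history: two runs stored before the increment, one after.
stored = {"run_a": 41, "run_b": 41, "run_c": 42}
print(comparable_references(stored, 42))
```

Under these assumptions, incrementing the counter leaves the older runs in place but excludes them from every future comparison.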
710c7e6 to aed070d
Fixes for gpu repro tests, auto-compare all state variables
Improve error message, increment ref counter
Fix zero_dict calls
Fix dict init
Fixes to zero_dict
Improve debug info
2d6a653 to f802f8b
The part I was getting stuck on was that the "print new MSE tables" job was not finding any MSE JSON files, but that's because they never got moved over (because we hadn't set the
If the restart job gets tripped up again, I'll merge manually.
This PR removes the `reproducibility_test` flag from the YAML config; we use an environment variable instead. I don't think experiments should need to worry about this explicitly, and there's no way for users to leverage this feature locally anyway, so an env flag seems more appropriate.
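As a rough illustration of the env-flag approach, here is a minimal Python sketch. The variable name `REPRODUCIBILITY_TEST` and the accepted truthy values are assumptions made for the example; this thread does not state what the PR actually uses.

```python
import os


def reproducibility_test_enabled(environ=None, var_name="REPRODUCIBILITY_TEST"):
    """Return True when the (hypothetical) env flag is set to a truthy value.

    `environ` defaults to the process environment; passing a plain dict
    makes the check easy to exercise without touching real settings.
    """
    env = os.environ if environ is None else environ
    return env.get(var_name, "").strip().lower() in {"1", "true", "yes"}


# The CI pipeline would export the variable; local runs leave it unset,
# so experiments need no reproducibility entry in their config at all.
if reproducibility_test_enabled():
    print("reproducibility test enabled")
```

Defaulting to "off" when the variable is absent matches the point above: experiment configs stay free of the setting and local runs are unaffected.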