Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move from torch.cuda.amp to torch.autocast; Add tests for amp #838

Open
wants to merge 33 commits into
base: main
Choose a base branch
from

Conversation

misko
Copy link
Collaborator

@misko misko commented Sep 9, 2024

This PR updates torch.cuda.amp.autocast(args...) and torch.cuda.amp.GradScaler(args...) (deprecated) and also adds CPU AMP tests. ( https://pytorch.org/docs/stable/amp.html )
Recently there was an eSCN model that did not run on GPU AMP and the current set of tests did not catch it. This PR adds those equivalent tests on CPU AMP, which will test this going forward.
Additional fixes haven been made to eSCN / SCN / Gemnet OC in this PR.

@misko misko marked this pull request as ready for review September 9, 2024 23:52
@misko misko added enhancement New feature or request patch Patch version release labels Sep 10, 2024
@misko misko mentioned this pull request Sep 10, 2024
Copy link
Collaborator

@lbluque lbluque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @misko, mostly style suggestions

src/fairchem/core/models/equiformer_v2/layer_norm.py Outdated Show resolved Hide resolved
src/fairchem/core/models/escn/escn.py Outdated Show resolved Hide resolved
src/fairchem/core/modules/scaling/fit.py Show resolved Hide resolved
tests/core/e2e/test_s2ef.py Show resolved Hide resolved
Copy link

codecov bot commented Sep 19, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...e/models/equiformer_v2/equiformer_v2_deprecated.py 0.00% 1 Missing ⚠️
src/fairchem/core/trainers/base_trainer.py 83.33% 1 Missing ⚠️
Files with missing lines Coverage Δ
...rc/fairchem/applications/cattsunami/core/ocpneb.py 87.12% <100.00%> (+0.09%) ⬆️
src/fairchem/core/common/relaxation/ase_utils.py 63.15% <100.00%> (+1.20%) ⬆️
...c/fairchem/core/models/equiformer_v2/layer_norm.py 58.62% <100.00%> (+0.72%) ⬆️
src/fairchem/core/models/escn/escn.py 95.65% <100.00%> (ø)
src/fairchem/core/models/gemnet_oc/gemnet_oc.py 89.68% <100.00%> (-0.23%) ⬇️
src/fairchem/core/models/scn/scn.py 93.56% <100.00%> (ø)
src/fairchem/core/modules/scaling/fit.py 73.80% <100.00%> (ø)
src/fairchem/core/trainers/ocp_trainer.py 69.12% <100.00%> (+0.10%) ⬆️
...e/models/equiformer_v2/equiformer_v2_deprecated.py 91.15% <0.00%> (ø)
src/fairchem/core/trainers/base_trainer.py 87.52% <83.33%> (-0.40%) ⬇️

... and 2 files with indirect coverage changes

@misko misko requested a review from lbluque September 19, 2024 23:46
lbluque
lbluque previously approved these changes Sep 20, 2024
Copy link
Collaborator

@lbluque lbluque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lg, just minor suggestions!

src/fairchem/applications/cattsunami/core/ocpneb.py Outdated Show resolved Hide resolved
src/fairchem/core/models/equiformer_v2/layer_norm.py Outdated Show resolved Hide resolved
src/fairchem/core/trainers/ocp_trainer.py Show resolved Hide resolved
lbluque
lbluque previously approved these changes Oct 3, 2024
Copy link
Collaborator

@lbluque lbluque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks @misko !

@lbluque lbluque disabled auto-merge October 16, 2024 22:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request patch Patch version release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants