[BUG] Signal killed caused by Adam Offload #164

Open
2 tasks done
MayDomine opened this issue Sep 1, 2023 · 0 comments
Labels
bug Something isn't working

Comments

@MayDomine
Collaborator

Is there an existing issue for this?

  • I have searched the existing issues

Description of the Bug

When I run the fine-tuning script for cpm-live-10b, the Adam offload optimizer causes the process to be killed by a signal. I also tried setting the Adam CPU level to 0 so that AVX is no longer used, but that did not help.
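Since the report involves switching AVX off, it may help to confirm which SIMD extensions the CPU actually advertises. The following is a minimal sketch, assuming an x86 Linux host where `/proc/cpuinfo` has a `flags` line; the helper names `parse_cpu_flags` and `simd_report` are hypothetical and not part of any library mentioned in this issue.

```python
from pathlib import Path
from typing import Dict, Set


def parse_cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the feature flags from /proc/cpuinfo-style text (first CPU only)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    # No "flags" line found (for example on non-x86 hosts).
    return set()


def simd_report(flags: Set[str]) -> Dict[str, bool]:
    """Report whether common AVX-family extensions are advertised."""
    return {name: name in flags for name in ("avx", "avx2", "avx512f")}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(simd_report(parse_cpu_flags(cpuinfo.read_text())))
    else:
        print("/proc/cpuinfo not available on this system")
```

If every entry is reported as present, a missing instruction set is a less likely explanation for the kill, which would be consistent with disabling AVX having no effect.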

Environment Information

- GCC version: 9.4.0
- Torch version: 1.13.0
- Linux system version: Ubuntu 20.04
- CUDA version: 11.6+
- Torch's CUDA version (as per `torch.version.cuda`): 1.13 + 11.6
- CPU core: 48
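
Most of the fields above can be gathered programmatically. This is a minimal sketch, assuming only the Python standard library plus an optional PyTorch install; `collect_env` is a hypothetical helper name. It reads `torch.version.cuda`, the attribute PyTorch exposes for the CUDA version it was built against.

```python
import os
import platform


def collect_env() -> dict:
    """Collect basic environment details for a bug report."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "cpu_cores": os.cpu_count(),
    }
    try:
        import torch  # optional: only needed for the torch fields

        info["torch"] = torch.__version__
        info["torch_cuda"] = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        info["torch"] = None
        info["torch_cuda"] = None
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

The GCC and system CUDA toolkit versions are not covered here and would still need to be checked separately.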

To Reproduce

bash CPM-Bee/src/script/finetune_cpm_bee_10b.sh
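
When reproducing, it can be useful to know which signal ended the process: on Linux, SIGKILL is what the out-of-memory killer sends, while SIGILL indicates an illegal instruction. The sketch below decodes a return code under two assumptions: Python's `subprocess` reports death by signal as a negative return code, and shells conventionally report it as 128 plus the signal number. The helper names `describe_returncode` and `run_and_describe` are hypothetical.

```python
import signal
import subprocess
from typing import List


def describe_returncode(rc: int) -> str:
    """Translate a process return code into a readable description."""
    if rc == 0:
        return "exited normally"
    if rc < 0:
        # subprocess convention: -N means "terminated by signal N".
        signum = -rc
    elif 128 < rc < 128 + signal.NSIG:
        # Shell convention: 128 + N usually means "terminated by signal N".
        signum = rc - 128
    else:
        return f"exited with status {rc}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        name = f"signal {signum}"
    return f"terminated by {name}"


def run_and_describe(cmd: List[str]) -> str:
    """Run a command and describe how it ended."""
    result = subprocess.run(cmd)
    return describe_returncode(result.returncode)
```

For example, `run_and_describe(["bash", "CPM-Bee/src/script/finetune_cpm_bee_10b.sh"])` would wrap the reproduction command above. Because the script may launch worker processes of its own, the code seen here might reflect the launcher rather than the worker that was killed, so the result is only a hint.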

Expected Behavior

The fine-tuning script runs normally without being killed.

Screenshots

No response

Additional Information

No response

Confirmation

  • I have reviewed and verified all the information provided in this report.
MayDomine added the bug label Sep 1, 2023