Changing batchsize (reducing it) will increase the amount of memory and won't free after increasing it back #451

Open
Tracked by #501
NoeTopeza opened this issue Jul 10, 2024 · 4 comments
Assignees
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@NoeTopeza
Collaborator

No description provided.

@NoeTopeza NoeTopeza added the bug Something isn't working label Jul 10, 2024
@NoeTopeza
Collaborator Author

Reducing the batch size increases the amount of memory the app uses.
Increasing it back does not release that memory.

@NoeTopeza
Collaborator Author

After further research, this issue seems old (it appears even at tag 12.10!)

@NoeTopeza
Collaborator Author

It may be time to use valgrind or another memory monitoring tool
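Whatever tool supplies the readings, the symptom can be checked mechanically: sample memory at the original batch size, again after reducing it, and once more after restoring it, then compare the last reading to the first. A minimal sketch of that comparison follows. The helper name `unreleased_memory`, the `tolerance` parameter, and the readings are all hypothetical and made up for illustration; they are not measurements from this project.

```python
from typing import Sequence


def unreleased_memory(samples: Sequence[int], tolerance: int = 0) -> int:
    """Return how far the final reading sits above the baseline.

    `samples` holds memory readings in any consistent unit, with the
    baseline first and the reading after restoring the batch size last.
    Growth at or below `tolerance` is treated as noise and reported as 0.
    """
    if len(samples) < 2:
        raise ValueError("need at least a baseline and a final sample")
    growth = samples[-1] - samples[0]
    return growth if growth > tolerance else 0


# Made-up readings in MiB: original batch size, reduced, restored.
readings_mib = [1200, 3100, 3100]
print(unreleased_memory(readings_mib, tolerance=50))
```

A non-zero result after several reduce/restore cycles would match the behaviour described in this issue; a result of 0 would suggest the memory does come back.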

@NoeTopeza NoeTopeza changed the title When batch_size = 1 gpu memory is satured and not freed after updating the batch size Changing batchsize (reducing it) will increase the amount of memory and won't free after increasing it back Jul 19, 2024
@NoeTopeza NoeTopeza added the good first issue Good for newcomers label Jul 26, 2024
@NoeTopeza NoeTopeza pinned this issue Aug 5, 2024
@noTban noTban assigned noTban and simon-riou and unassigned noTban Sep 16, 2024
@simon-riou
Collaborator

While fixing this issue, we noticed a memory leak on one specific computer: memory leaked while the GUI was open and idle (without doing anything). It was caused by an outdated Nvidia driver version (535.161.07 on Linux, 538.33 on Windows) with a known issue that was fixed afterwards.

https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-535-161-07/index.html :

  • Resolved an issue that caused a memory leak while calling NVML init followed by NVML shutdown in the same thread more than 500 times. 4370862
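Updating the driver is the fix. For code that talks to NVML at all, one way to stay clear of the pattern in that release note (repeated init followed by shutdown in the same thread) is to initialise once and shut down once at exit. The sketch below assumes this and is not taken from this project: the class `NvmlSession` is hypothetical, and the init and shutdown callables are injected so the example runs without a GPU. With the `pynvml` bindings they could be `pynvml.nvmlInit` and `pynvml.nvmlShutdown`.

```python
from typing import Callable


class NvmlSession:
    """Start NVML lazily once and shut it down once (not thread-safe)."""

    def __init__(self, init: Callable[[], None], shutdown: Callable[[], None]):
        self._init = init
        self._shutdown = shutdown
        self._started = False

    def ensure_started(self) -> None:
        # Only the first call reaches the driver; later calls are no-ops.
        if not self._started:
            self._init()
            self._started = True

    def close(self) -> None:
        # Safe to call more than once; shutdown runs only if init ran.
        if self._started:
            self._shutdown()
            self._started = False


# Stand-in callables that record calls instead of touching a driver.
calls = []
session = NvmlSession(lambda: calls.append("init"),
                      lambda: calls.append("shutdown"))

for _ in range(1000):  # e.g. one call per GUI refresh
    session.ensure_started()
session.close()

print(calls)
```

Here 1000 polls produce a single init and a single shutdown, instead of 1000 init/shutdown pairs.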

@noTban noTban mentioned this issue Sep 16, 2024
2 tasks

3 participants