Fix memory leak in zeroGradients() #2792
Conversation
Force-pushed from 7d61e5c to c668fda
Codecov Report

```diff
@@            Coverage Diff             @@
##           master    #2792      +/-  ##
============================================
+ Coverage     72.08%   72.23%   +0.14%
- Complexity     5126     7113    +1987
============================================
  Files           473      702     +229
  Lines         21970    31682    +9712
  Branches       2351     3284     +933
============================================
+ Hits          15838    22884    +7046
- Misses         4925     7236    +2311
- Partials       1207     1562     +355
```

☔ View full report in Codecov by Sentry.
If this is a memory leak, wouldn't it be happening in all usages of getGradient()? I don't think that is the semantics we were going for with that function. Maybe a better approach would be to cache the gradient as a property of the MxNDArray and PtNDArray. Something like:

```java
NDArray getGradient() {
    if (this.gradient != null) {
        return this.gradient;
    }
    NDArray gradient = ...
    this.gradient = gradient;
    return gradient;
}
```

That way, it would fix this memory leak along with other possible gradient leaks. As part of this, it may also make sense to close the gradient NDArray when closing the main NDArray, although it is likely to be closed anyway by the NDManager. As for the setting part, try with …
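For illustration, a rough sketch of that companion change, assuming a cached `gradient` field on an engine-specific NDArray class (the field and override are hypothetical, not DJL's current code):

```java
// Hypothetical sketch: release the cached gradient together with the array itself.
@Override
public void close() {
    if (this.gradient != null) {
        this.gradient.close(); // free the native handle held by the cached gradient
        this.gradient = null;
    }
    super.close(); // the NDManager would otherwise reclaim it eventually
}
```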
Yes, of course! All usages of getGradient() must be closed after use.
But changing the semantics of getGradient() now would require fixing all the other code that already relies on this behaviour and closes the gradients it gets.
The code in zeroGradients() of NDArray doesn't close the gradient tensor after use.
The getGradient() method shares the underlying memory across calls, but on the DJL side every call creates a new native handle and a new NDArray instance, so we must close it after use.
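A minimal sketch of the fix, assuming zeroGradients() zeroes the gradient by in-place subtraction (the exact diff may differ):

```java
// Sketch: NDArray is AutoCloseable in DJL, so try-with-resources releases the
// extra handle created by getGradient() once the gradient has been zeroed.
default void zeroGradients() {
    try (NDArray grad = getGradient()) {
        grad.subi(grad); // zero the shared gradient buffer in place
    }
}
```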
There is another problem with this code that I haven't fixed yet.
Zeroing a tensor by subtracting it from itself does not work for NaNs and infinities.
So if we receive NaN or Infinity in a gradient, it will not be possible to recover from that error by calling zeroGradients(). We would have to set the array's values instead of subtracting from it, but the semantics of the set() methods don't allow working uniformly with NDArrays of different shapes and single scalars.
For example, array.set("*", 0) works for arrays of all shapes, but does not work for a single scalar shape without dimensions.
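To illustrate the first point with plain Java (not DJL): IEEE-754 subtraction propagates NaN, and Infinity minus Infinity is NaN, so "x - x" can never zero those entries.

```java
// Why subtracting a tensor from itself cannot zero NaN/Infinity entries.
public class SubtractSelfDemo {
    public static void main(String[] args) {
        System.out.println(1.5 - 1.5);                                           // 0.0
        System.out.println(Double.NaN - Double.NaN);                             // NaN
        System.out.println(Double.POSITIVE_INFINITY - Double.POSITIVE_INFINITY); // NaN
    }
}
```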
So this commit fixes the memory leak only.