slow/high allocation gradient with mapreduce and iterators #1487
Comments
Diffing through non-array iterators is going to be tough in general, because the interface itself is rather constrained yet allows almost arbitrary behaviour from implementations. This is less of a problem for forward-mode ADs like ForwardDiff because they essentially run the same function twice, but per the name, reverse-mode ADs have to "reverse" all the operations, and that can be tricky to do (let alone performantly). Zygote in particular is not well-equipped here because it works on unoptimized IR, whereas normal iteration code relies heavily on optimizations like inlining to have good performance. We've specialized operations for certain types like arrays because they're a known quantity, but the language doesn't provide us with many tools to do the same for looser types like
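To illustrate why forward mode is largely indifferent to iterators, here is a minimal sketch using a hand-rolled dual number. The `Dual` type, the objective `f`, and `forward_gradient` are hypothetical names made up for this illustration; this is not how ForwardDiff is implemented, only the same basic idea. The function body runs unchanged over a lazy generator, and the derivative simply rides along with the value.

```julia
# Minimal dual number: a value plus the derivative carried alongside it.
# (Hypothetical type for illustration only.)
struct Dual
    val::Float64
    der::Float64
end

# Sum rule and product rule, applied as the original code runs forward.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)

# Hypothetical objective written against a lazy generator, not an array.
f(xs) = mapreduce(x -> x * x, +, (x for x in xs))

# One forward pass per input: seed input i with derivative 1, the rest with 0.
function forward_gradient(f, x::Vector{Float64})
    map(eachindex(x)) do i
        duals = [Dual(x[j], i == j ? 1.0 : 0.0) for j in eachindex(x)]
        f(duals).der
    end
end

g = forward_gradient(f, [1.0, 2.0, 3.0])
```

Nothing here had to know how the generator iterates. A reverse-mode tool, by contrast, has to record each iteration step and replay the steps backwards, which is where the iterator interface gets in the way.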
Thanks for the quick response, and for helping me understand the issue! I'll try to figure out how to reformulate my problem without iterators then...
I think if you're sticking with Zygote, the current approach you have of using operations with eager materialization/implicit vectorization is the way to go. Depending on the use case, it may also be worth trying out other ADs since they'll have different performance characteristics.
I think this probably boils down to
I'm a bit too lazy to write this out, but I'm sure you can replace
Here's the beginning of the error I get when I try to Zygote gradient the map(sum(...)) approach:
The second suggestion unfortunately can't work in this case, since I need to
FWIW, ReverseDiff seems to handle the mapreduce() approach just fine.
Not sure how you got that, working code is:
Not sure I understand. Your code applies it after
Ah, I see that you were referring to the
I am trying to get the
Ok. It's sad that just indexing is still so expensive. There was a ton of code written to do this more efficiently via InplaceableThunks in ChainRules, but it's not used by Zygote (nor anywhere else). What you can do is make something like
Maybe
With this formulation, Zygote makes a gradient that is about as fast (with identical allocations) as
Oh cool. With that, Zygote is even a bit faster than a taped version with ReverseDiff. Thanks.
I've found that when I compute the gradient of a mapreduce expression whose inputs are slices, Zygote generates fast/low-allocation gradients. However, Zygote's gradient for the equivalent expression with iterator inputs is much slower, with much higher allocations. In the MWE below, each iterator result is of homogeneous size, so in principle the code to run should be the same. However, I encountered this issue when planning for code where the iterators would be generating results of heterogeneous sizes, where my seemingly fast slice/reshape strategy would no longer be feasible. Is this unavoidable? FWIW, ForwardDiff's gradients for the two approaches are equally fast and have similar allocation counts. Is it possible this is related to #304?
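To make the two formulations concrete, here is a small hypothetical sketch. It is not the MWE from this issue: `kernel`, `loss_slices`, and `loss_iter` are illustrative names, and `Iterators.partition` is only one possible choice of iterator. Both functions compute the same value when the chunks have equal length, but only the iterator version copes with a ragged final chunk.

```julia
# Hypothetical per-chunk kernel; stands in for whatever is being mapped.
kernel(chunk) = sum(abs2, chunk)

# Slice formulation: reshape into an n-row matrix and reduce over its columns.
# Requires length(x) to be divisible by n.
loss_slices(x, n) = mapreduce(kernel, +, eachcol(reshape(x, n, :)))

# Iterator formulation: lazily walk chunks of (up to) n elements.
loss_iter(x, n) = mapreduce(kernel, +, Iterators.partition(x, n))

x = collect(1.0:12.0)
same_value = loss_slices(x, 3) == loss_iter(x, 3)

# Ten elements in chunks of three: the last chunk has a single element,
# which the reshape-based version cannot represent.
ragged = loss_iter(collect(1.0:10.0), 3)
```

A pair of formulations of this kind is what the question is about: the values agree, and the difference described above only shows up when taking reverse-mode gradients.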
Thanks in advance for any suggestions the Zygote community has here.
Here is an MWE: