
Differences with ImplicitDifferentiation.jl? #4

Closed

gdalle opened this issue Oct 17, 2022 · 12 comments

@gdalle commented Oct 17, 2022

Hey there, and congrats on the package!
Could we take some time to reflect on the differences between your work and https://github.com/gdalle/ImplicitDifferentiation.jl, which I recently developed? I feel like they have similar goals, and maybe we could work together to avoid duplicating effort?

@andrewning (Member)

Would be happy to. Your package looks great! Not trying to create duplicates; we just needed something for our lab, where we run into this scenario quite a bit. I think the main difference is that we use ForwardDiff a lot, so this package focuses more on that, whereas yours targets AD packages that are ChainRules-compatible (which would be useful to us down the road). It looks like yours also has support for lazy operators, which is nice. I also added some functionality for defining custom rules, mainly to support one of our collaborators who needs to call some Python code for a subfunction; we use finite differencing there and inject the result back into the AD chain. That doesn't really have anything to do with implicit differentiation, but it reuses some of the same functionality. Definitely open to working together.
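
A minimal sketch of that rule-injection idea, expressed here with ChainRulesCore and FiniteDifferences rather than ImplicitAD's actual provide_rule mechanism (which targets ForwardDiff/ReverseDiff directly); `black_box` is a hypothetical stand-in for the Python call:

```julia
using ChainRulesCore, FiniteDifferences

# hypothetical stand-in for a black-box subfunction (e.g. a Python call)
# whose internals AD cannot trace
black_box(x) = sin.(x) .* x

function ChainRulesCore.rrule(::typeof(black_box), x)
    y = black_box(x)
    function black_box_pullback(ȳ)
        # finite-difference the black box and inject Jᵀȳ back into the AD chain
        J, = FiniteDifferences.jacobian(central_fdm(5, 1), black_box, x)
        return NoTangent(), J' * ȳ
    end
    return y, black_box_pullback
end
```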

@gdalle (Author) commented Oct 18, 2022

Adding ForwardDiff compatibility is definitely among our short-term goals, perhaps with the help of https://github.com/ThummeTo/ForwardDiffChainRules.jl.
It would also be interesting to discuss the special needs of your lab, because that might reveal some user expectations that we might have missed :)
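
For reference, a sketch of what that bridge might look like, following the `@ForwardDiff_frule` pattern from the ForwardDiffChainRules.jl README; `f` is a toy function, and the exact macro signature should be checked against that package's docs:

```julia
using ForwardDiff, ForwardDiffChainRules, ChainRulesCore

f(x::AbstractVector) = sum(exp, x)

# derivative knowledge lives in a ChainRules frule...
function ChainRulesCore.frule((_, ẋ), ::typeof(f), x::AbstractVector)
    return f(x), sum(exp.(x) .* ẋ)
end

# ...and the macro reuses it when ForwardDiff pushes Dual numbers through f
@ForwardDiff_frule f(x::AbstractVector{<:ForwardDiff.Dual})

ForwardDiff.gradient(f, [0.0, 1.0])  # ≈ exp.([0.0, 1.0])
```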

@taylormcd (Contributor)

After looking at the current state of both packages I think the primary differences are:

  • ImplicitAD supports ForwardDiff and ReverseDiff.
  • ImplicitDifferentiation supports forward- and reverse-mode ChainRules-compatible AD packages, while ImplicitAD supports only reverse-mode ChainRules-compatible AD packages.
  • ImplicitAD supports using arbitrary linear solvers with user-defined Jacobians (this is useful because functions for these Jacobians are sometimes already defined, since they are often used to solve nonlinear systems of equations).
  • ImplicitDifferentiation supports only iterative linear solvers, since it doesn't materialize the Jacobian.
  • ImplicitAD provides specialized methods for linear systems of equations.

Overall, it seems like ImplicitDifferentiation is designed to be efficient for very large implicit systems, while the default settings for ImplicitAD are more appropriate for smaller systems of equations. That said, with the right arguments ImplicitAD can handle large systems of equations efficiently as well; I believe it is even possible to adopt a theoretically identical approach to ImplicitDifferentiation if the right inputs are provided. ImplicitAD therefore appears to be the more generic of the two packages at the moment, though whether ImplicitDifferentiation's interface or ImplicitAD's is better is debatable.
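
For context, both packages build on the implicit function theorem: if r(y, x) = 0 defines y(x), then dy/dx = -(∂r/∂y)⁻¹ (∂r/∂x). A package-agnostic sketch using only ForwardDiff for the partial derivatives:

```julia
using ForwardDiff, LinearAlgebra

# residual r(y, x) = 0 implicitly defines y(x); toy elementwise example
r(y, x) = y .^ 3 .+ y .- x

# "solver": plain Newton iteration on r(y, x) = 0
function solve(x)
    y = copy(x)
    for _ in 1:25
        J = ForwardDiff.jacobian(yy -> r(yy, x), y)
        y -= J \ r(y, x)
    end
    return y
end

# implicit function theorem: dy/dx = -(∂r/∂y)⁻¹ (∂r/∂x)
function implicit_jacobian(x)
    y = solve(x)
    drdy = ForwardDiff.jacobian(yy -> r(yy, x), y)
    drdx = ForwardDiff.jacobian(xx -> r(y, xx), x)
    return -(drdy \ drdx)
end

implicit_jacobian([1.0, 2.0])  # diagonal with entries 1 ./ (3y.^2 .+ 1)
```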

@andrewning (Member)

@taylormcd I thought you could use non-iterative linear solvers with LinearOperator.jl. I haven't actually tried either package, so I'm not totally sure.
Here is another package: https://julianonconvex.github.io/Nonconvex.jl/stable/gradients/implicit/. It looks pretty similar; I'm not sure what all the differences are.
The reality is that implicit differentiation is relatively straightforward, so it's not surprising to find it in a few places, and any one of these three (perhaps there are others too?) could be brought to feature parity pretty quickly. It's not necessarily a bad thing to have multiple packages with different emphases/approaches, at least until things mature more. I wouldn't be surprised if future AD packages bake in equivalent functionality.

@taylormcd (Contributor) commented Oct 24, 2022

I agree that any one of the three could be brought up to feature parity; I just wanted to present a general overview of the current status of the two packages. With regard to the use of LinearOperator.jl: you have to materialize a matrix in order to factorize it and perform a non-iterative linear solve, so a matrix-multiplication linear operator only works with iterative linear solvers.
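
To illustrate that point: a matrix-free operator defines only the product A*v, which is all an iterative solver needs, while a direct solve must first materialize the matrix. LinearMaps.jl and IterativeSolvers.jl are illustrative choices here, not what either package necessarily uses:

```julia
using LinearAlgebra
using LinearMaps, IterativeSolvers  # illustrative package choices

n = 100
# matrix-free linear operator: only the action v ↦ 2v + circshift(v, 1) is defined
A = LinearMap(v -> 2 .* v .+ circshift(v, 1), n)
b = ones(n)

x_iter = gmres(A, b)          # fine: GMRES only needs matrix-vector products
# lu(A) has no method: factorization needs the entries of A
x_direct = lu(Matrix(A)) \ b  # must materialize the matrix first
```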

The implementation in Nonconvex.jl (which ImplicitDifferentiation.jl appears to be based on) seems to be pretty well put together. ForwardDiff support should be possible using the ForwardDiff_frule macro defined in the same package, and ReverseDiff support should be possible using the ReverseDiff.@grad_from_chainrules macro. Considering these capabilities, I think the only features in this package not provided by Nonconvex.jl are the implicit_linear and provide_rule functions.
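
A sketch of the @grad_from_chainrules route, with a toy function `g` standing in for the implicit function; the exact behavior should be verified against ReverseDiff's docs:

```julia
using ReverseDiff: ReverseDiff, TrackedArray
using ChainRulesCore

g(x) = sum(abs2, x)

function ChainRulesCore.rrule(::typeof(g), x)
    g_pullback(ȳ) = (NoTangent(), 2 .* ȳ .* x)
    return g(x), g_pullback
end

# route tracked arrays through the ChainRules rrule
ReverseDiff.@grad_from_chainrules g(x::TrackedArray)

ReverseDiff.gradient(g, [1.0, 2.0])  # == [2.0, 4.0]
```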

@taylormcd (Contributor)

Actually, it seems like an frule hasn't been defined in Nonconvex, so that would need to be implemented before ForwardDiff support can be added.

@taylormcd (Contributor)

It also seems like ReverseDiff.@grad_from_chainrules doesn't work on the implementation in NonconvexUtils.jl either.

@gdalle (Author) commented Oct 26, 2022

> After looking at the current state of both packages I think the primary differences are:

Thank you for the careful review!

> ImplicitAD supports using arbitrary linear solvers with user-defined Jacobians

One of our projects is to add an option whereby the forward solver returns the Jacobian in addition to the solution, in order to save one call to AD.
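
A hypothetical sketch of that option, reusing the toy Newton example from earlier in the thread: the solver hands back ∂r/∂y along with y, so the implicit-differentiation step skips one AD call. Neither package's actual API is assumed here:

```julia
using ForwardDiff, LinearAlgebra

r(y, x) = y .^ 3 .+ y .- x  # same toy residual as above

# hypothetical interface: Newton already forms ∂r/∂y, so return it with y
function solve_with_jacobian(x)
    y = copy(x)
    drdy = ForwardDiff.jacobian(yy -> r(yy, x), y)
    for _ in 1:25
        y -= drdy \ r(y, x)
        drdy = ForwardDiff.jacobian(yy -> r(yy, x), y)
    end
    return y, drdy
end

function implicit_jacobian(x)
    y, drdy = solve_with_jacobian(x)  # no extra AD call for ∂r/∂y
    drdx = ForwardDiff.jacobian(xx -> r(y, xx), x)
    return -(drdy \ drdx)
end
```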

> ImplicitDifferentiation supports only iterative linear solvers, since it doesn't materialize the Jacobian.

That's completely correct, and it's one of our main design decisions (which also makes the implementation slightly nightmarish).

> Overall, it seems like ImplicitDifferentiation is designed to be efficient for very large implicit systems, while the default settings for ImplicitAD are more appropriate for smaller systems of equations.

Sounds like a good summary, I'll add a link to your package in our docs :)

@andrewning (Member)

This package also works with iterative solvers (someone else in our lab is using ImplicitAD this way). It's just not the default; you have to make use of the keyword arguments.

@andrewning (Member)

Getting back to working on this package...I'll add a summary/link to your package later today.

@mohamed82008 commented Sep 14, 2023

Hi! Main developer of Nonconvex.jl and contributor to ImplicitDifferentiation.jl here. I just found this package on JuliaHub and saw this discussion. Cool package!

To give a bit of history, Nonconvex.jl probably has the oldest implementation of generic implicit AD in Julia (https://discourse.julialang.org/t/ann-differentiable-implicit-functions-in-julia-optimisation-nonlinear-solves-and-fixed-point-iterations/76016). Specific implicit functions had AD rules defined in SciML and other repos before Nonconvex.jl, but these were not doing generic implicit AD.

ImplicitDifferentiation (ID) is @gdalle's work, which was initially loosely based on the Nonconvex.jl implementation with the goal of being better designed, tested, and documented. We collaborate on this project, although he deserves most of the credit. I think ID 0.5 now has wide feature coverage, including many of the features highlighted above which were missing a few months ago. It might be worth re-examining whether we can join forces and figure out better and faster package designs that work for everyone.

@andrewning (Member)

Thanks for reaching out, and great to hear of the continued progress! I agree with your assessment in the Discourse thread; at least, we've found that approach quite useful.

To update from our end, we've mostly been working on approaches to alleviate memory issues for long time sequences (e.g., long loops, ODEs). We've added some functionality that really sped up some of the problems we've been working on.

Would be happy to collaborate in areas where we can. We have a couple grants tied to ongoing/future work related to this package.
