Add initial support for diff of ref return types in rev mode #425
Closed
parth-07 force-pushed the rev-ref-returns branch from cc098b2 to 74904d8 (April 27, 2022 14:34)
This was referenced Jul 20, 2023
Merged
The work in this PR was rebased on top of master and merged as part of #601.
infinite-void-16 added a commit to infinite-void-16/clad that referenced this pull request (Aug 10, 2024):

This commit adds support for custom (user-provided) `_forw` functions. A `_forw` function, if available, is called in place of the actual function.

For example, if the primal code contains:

```cpp
someFn(u, v, w);
```

and the user has defined a custom `_forw` function for `someFn` as follows:

```cpp
namespace clad {
namespace custom_derivatives {
void someFn_forw(double u, double v, double w, double *d_u, double *d_v, double *dw) {
  // ...
  // ...
}
}
}
```

Then clad will generate the derivative function as follows:

```cpp
// forward-pass
clad::custom_derivatives::someFn_forw(u, v, w, d_u, d_v, d_w);
// ...

// reverse-pass; no change in reverse-pass
someFn_pullback(u, v, w, d_u, d_v, d_w);
// ...
```

But more importantly, why do we need such a functionality? Two reasons:

- Supporting reference/pointer return types in the reverse-mode. This has been discussed at great length here: vgvassilev#425
- Supporting types whose elements grow dynamically, such as `std::vector` and `std::map`. The issue is that we need to correctly update the size/property of the adjoint variable when a function call updates the size/property of the corresponding primal variable. However, the actual function call does not modify the adjoint variable. Here `_forw` functions come to the rescue: they make it possible to adjust the adjoint variable size/properties along with executing the code of the actual function call.
vgvassilev pushed a commit that referenced this pull request (Aug 20, 2024):

This commit adds support for custom (user-provided) `_forw` functions. A `_forw` function, if available, is called in place of the actual function.

For example, if the primal code contains:

```cpp
someFn(u, v, w);
```

and the user has defined a custom `_reverse_forw` function for `someFn` as follows:

```cpp
namespace clad {
namespace custom_derivatives {
void someFn_reverse_forw(double u, double v, double w, double *d_u, double *d_v, double *dw) {
  // ...
  // ...
}
}
}
```

Then clad will generate the derivative function as follows:

```cpp
// forward-pass
clad::custom_derivatives::someFn_reverse_forw(u, v, w, d_u, d_v, d_w);
// ...

// reverse-pass; no change in reverse-pass
someFn_pullback(u, v, w, d_u, d_v, d_w);
// ...
```

But more importantly, why do we need such a functionality? Two reasons:

- Supporting reference/pointer return types in the reverse-mode. This has been discussed at great length here: #425
- Supporting types whose elements grow dynamically, such as `std::vector` and `std::map`. The issue is that we need to correctly update the size/property of the adjoint variable when a function call updates the size/property of the corresponding primal variable. For example, a call to `vec.push_back(...)` should update the size of `_d_vec` as well. However, the actual function call does not modify the adjoint variable in any way. Here `_forw` functions come to the rescue: they make it possible to adjust the adjoint variable size/properties along with executing the actual function call.

Please note that the `_reverse_forw` function signature takes adjoint variables as arguments and returns `clad::ValueAndAdjoint<U, V>` to support the reference/pointer return type.
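To make the second motivation concrete, here is a minimal sketch of what a user-provided forward-pass function for `push_back` might look like. The name `push_back_reverse_forw`, its free-function form, and its parameter list are assumptions made for illustration and are not taken from Clad; the only point is that the adjoint container grows in step with the primal one.

```cpp
#include <vector>

// Hypothetical custom forward-pass function (illustrative signature only).
// It performs the primal push_back and also grows the adjoint vector, so that
// `d_v` always holds one adjoint slot per element of `v`.
void push_back_reverse_forw(std::vector<double>* v, double val,
                            std::vector<double>* d_v, double /*d_val*/) {
  v->push_back(val);    // the original call
  d_v->push_back(0.0);  // matching zero-initialised adjoint slot
}
```

Calling the primal `push_back` alone would leave `d_v` one element short, which is the mismatch described in the commit message.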
This PR adds initial support for correctly differentiating function calls with reference return types. The main motivation for adding this support is that many class operator overloads, such as `=`, `+=`, `-=` and `*=`, naturally return a reference to the class object.

The work done in this PR develops a base that can be extended to add many other functionalities as well, most notably differentiation of function calls with pointer return types and maintaining the correct stack of pointer derivatives when pointers are passed to a call expression.
What does this PR solve?
C++ has the functionality to declare a variable as a reference (an alias) to an already existing variable, object or function.
From the mathematical point of view, a reference declaration is a no-operation: we are just defining a new name for an already existing variable. The point to emphasize is that, unlike a normal variable declaration, a reference declaration should not have any corresponding reverse pass statements. That said, a reference variable declaration does impact the differentiation of further operations.
For example:
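As a minimal illustration (the function names and the factor 7 are made up for this sketch), the updates marked (1) and (2) below act on the same storage and compute the same value:

```cpp
// Hypothetical example: both functions scale `a` by 7.
double scaleViaReference(double a) {
  double& a_ref = a;  // alias: no new value is computed here
  a_ref = 7 * a_ref;  // (1) update through the reference
  return a;
}

double scaleDirectly(double a) {
  a = 7 * a;          // (2) update the variable itself
  return a;
}
```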
Statements described in (1) and (2) are mathematically identical. Thus they should produce derived statements that have the same behaviour as well. In Clad, we solve this by effectively using the same derivative variable for both the original variable and the reference variable. To put things more concretely, please consider the following code snippet:
double& a_ref = a;
This statement produces the following statements in the derived function:
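A plausible hand-written rendering of those statements is sketched below, assuming the `_d_` prefix for adjoint variables. This is not actual Clad output; the wrapper function `seedThroughAlias` and its `seed` parameter are illustrative additions so that the snippet can be compiled and run.

```cpp
// Hand-written imitation of derived code, not actual Clad output.
double seedThroughAlias(double a, double seed) {
  double _d_a = 0;          // adjoint of `a`
  double& _d_a_ref = _d_a;  // adjoint of `a_ref` aliases the adjoint of `a`
  double& a_ref = a;        // the original reference declaration
  (void)a_ref;              // not used further in this sketch
  _d_a_ref += seed;         // an update through _d_a_ref is visible in _d_a
  return _d_a;
}
```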
Please note here that the derivative of `a_ref` is a reference variable pointing to the derivative of `a`.

In the example that we just discussed, we can easily point `_d_a_ref` to `_d_a` because the derivative of `a` is known at compile time. This is not always the case. For example, a reference `ref` may be initialised from the result of a call expression. In that case, we cannot determine which variable `ref` is referencing at compile time. Thus, we also cannot determine which derivative `_d_ref` should refer to.

This PR provides functionality to correctly point `_d_ref` to the derivative of the variable to which `ref` refers when this variable is not known at compile time.

How does this PR solve this problem?
This PR solves the problem of correctly setting the derivative of a reference variable when the reference variable is being assigned to the result of a call expression by modifying the primal call expression such that it returns both the primal computation and the adjoint information.
When Clad is differentiating a call to a function `someFn` that returns a reference, Clad generates a new function `someFn_forw` by transforming the original function `someFn` such that it takes adjoint information as input parameters and returns both the primal value and the adjoint information. For the remainder of this discussion, I will refer to this transformation mode as Reverse Mode Forward Pass mode. Please suggest a better and more intuitive name for this transformation mode.

For example, given such a function `someFn` and its corresponding `someFn_forw`, the following statement:
double& ref = someFn(i, j);
will produce the following statements in the derived function:
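A hand-written sketch of the whole chain follows, under several assumptions: `someFn` is taken to return a reference to the larger of its two arguments (hypothetical), `ValueAndAdjoint` is a simplified local stand-in for `clad::ValueAndAdjoint<U, V>`, and the parameter list of `someFn_forw` is illustrative rather than what Clad generates. The helper `gradWrtJ` exists only to make the snippet runnable.

```cpp
// Simplified local stand-in for clad::ValueAndAdjoint<U, V>.
template <typename T, typename U> struct ValueAndAdjoint {
  T value;
  U adjoint;
};

// Hypothetical primal: which argument is returned is only known at run time.
double& someFn(double& i, double& j) { return (i > j) ? i : j; }

// Forward-pass counterpart: same control flow, but it also returns a
// reference to the adjoint that matches the returned primal reference.
ValueAndAdjoint<double&, double&> someFn_forw(double& i, double& j,
                                              double& _d_i, double& _d_j) {
  if (i > j) return {i, _d_i};
  return {j, _d_j};
}

// Imitation of the derived statements for `double& ref = someFn(i, j);`
// where `ref` is then treated as the result. Returns d(result)/dj.
double gradWrtJ(double i, double j) {
  double _d_i = 0, _d_j = 0;
  ValueAndAdjoint<double&, double&> _t0 = someFn_forw(i, j, _d_i, _d_j);
  double& _d_ref = _t0.adjoint;  // bound to _d_i or _d_j at run time
  double& ref = _t0.value;
  (void)ref;                     // not used further in this sketch
  _d_ref += 1;                   // reverse pass: seed the adjoint of `ref`
  return _d_j;
}
```

Because `_d_ref` is bound through the value returned by `someFn_forw`, the seed lands in `_d_j` only when `someFn` actually returned `j`.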
Please note that the pullback, or `dfdx()`, should be zero-tangent when the return type is a reference. This is discussed in the next section.

Problems and design decisions
ReverseModeForwPassVisitor
Currently, I have created a new visitor class `ReverseModeForwPassVisitor` that inherits from `ReverseModeVisitor` and is responsible for creating forward pass functions (`_forw` functions). Please note that the transformation to generate a forward pass function only requires a tiny subset of the functionality of Reverse Mode. To be precise, the required functionality is: a correctly initialised variable declaration for the derivative of each local variable, and a forward pass that effectively traces the primal function.

Benefits of inheriting from `ReverseModeVisitor`:

- If we inherit from `ReverseModeVisitor`, discard the reverse pass, and only provide the implementation of the `Visit*` functions that differ in `ReverseModeForwPass` mode, such as `VisitReturnStmt`, then we can avoid too much code duplication. If we don't inherit from `ReverseModeVisitor`, then we would need to provide the implementation for all of the `Visit*` functions. In most of these functions, we would simply be cloning the AST nodes.

Disadvantages of inheriting from `ReverseModeVisitor`:

- If we inherit from `ReverseModeVisitor`, then we would not be able to generate Reverse Mode Forward Pass transformed functions of any function that cannot be differentiated in Reverse Mode, for example, due to the lack of support for some C++ construct. This is problematic because it is a lot easier to generate Reverse Mode Forward Pass transformed functions than it is to perform complete Reverse Mode differentiation, as most C++ features and constructs simply need to be cloned in Reverse Mode Forward Pass.
- Inheriting from `ReverseModeVisitor` implicitly assumes that the forward pass of a Reverse Mode derived function has effectively the same behaviour as the primal function. In the future, if we add active variable and data flow analysis in Reverse Mode, and we will be adding it soon, then this assumption will no longer hold.
- We may use `_forw` functions for adding support for other functionalities as well. Many of these functionalities may require transformations that diverge from the functionalities included in the Reverse Mode forward pass, and thus we may need to provide separate implementations of more and more `Visit*` functions.

Derivative expressions to be used in Reverse Mode forward pass
Until now, we only needed to use derivative variables in the reverse pass. With the introduction of `_forw` functions, we need to use derivative variables in the forward pass too, since a `_forw` function takes adjoint information as well.

The problem here is that the `Visit` function is designed to return a `StmtDiff` that consists of a clone of the original expression and the corresponding derivative, if it exists. Sometimes, the clone contains a `clad::push(...)` expression and the derivative contains a `clad::pop(...)` expression. In these cases, the derivative is designed to be used in the reverse pass only; a notable example is array expressions in loops. Therefore, we need a routine that allows us to conveniently build derivative expressions that can be used in the forward pass. Building such a routine will be non-trivial because the `Visit*` functions are tightly coupled with how derivative expressions are obtained.

Pullback function signature of functions with reference return types
For a function `fn`, the pullback function should have the signature as follows:
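As a sketch, assume a hypothetical `fn` that returns the product of its two arguments. The pullback below takes the primal arguments, then the adjoint of the return value (`_d_y`), then pointers to the argument adjoints. This parameter order is an assumption made for illustration.

```cpp
// Hypothetical primal function: y = fn(i, j).
double fn(double i, double j) { return i * j; }

// Hand-written pullback: propagates the adjoint of the return value, _d_y,
// to the adjoints of the arguments.
void fn_pullback(double i, double j, double _d_y, double* _d_i, double* _d_j) {
  *_d_i += _d_y * j;  // dy/di = j
  *_d_j += _d_y * i;  // dy/dj = i
}
```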
For functions with reference return types, the situation is slightly more complicated.

`_d_y`, or the pullback, is used to initialise/correctly set the return value derivative. For example:
return val;
Inside a pullback function, this statement gets differentiated to:
// reverse pass
_d_val += _d_y;
Intuitively, this behaviour can be reasoned as follows:
If the return value of the function `someFn` is `val`, then the statement `y = someFn(i, j)` can effectively be visualised as the body of `someFn` followed by the assignment `y = val`.

Now, if `y` is a reference variable, then `double& y = val` becomes a no-operation and should have no corresponding reverse mode derived statement. Therefore, ideally, pullback functions with reference return types do not have any corresponding pullback value. Again, sorry for the confusing terminology. We have two ways to proceed from here:

- Pass a dummy pullback (`_d_y`) value. The dummy value should be equal to the zero-tangent vector of the return type.

Or

- Treat functions with reference return types the same as functions with `void` return types. This way, we would not need to pass any dummy value while calling pullback functions. On the downside, this makes the rules of the pullback function signature slightly more complicated.

Please give your reviews and suggestions on the approach used, the problems and the design decisions discussed here.