-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reimplement jacobians using the vectorized forward mode #1121
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1121 +/- ##
==========================================
- Coverage 94.44% 94.39% -0.06%
==========================================
Files 50 51 +1
Lines 8770 8849 +79
==========================================
+ Hits 8283 8353 +70
- Misses 487 496 +9
... and 1 file with indirect coverage changes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
struct JacobianDerivedFnTraits<R (C::*)(Args...) cv vol ref noex> { \ | ||
using type = void (C::*)(Args..., SelectLast_t<Args...>) cv vol ref noex; \ | ||
}; | ||
#define JacobianDerivedFnTraits_AddSPECS(var, cv, vol, ref, noex) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: function-like macro 'JacobianDerivedFnTraits_AddSPECS' used; consider a 'constexpr' template function [cppcoreguidelines-macro-usage]
#define JacobianDerivedFnTraits_AddSPECS(var, cv, vol, ref, noex) \
^
clad::utils::ComputeEffectiveFnName(FD) + "_pushforward"; | ||
callDiff = m_Builder.BuildCallToCustomDerivativeOrNumericalDiff( | ||
customPushforward, customDerivativeArgs, getCurrentScope(), | ||
const_cast<DeclContext*>(FD->getDeclContext())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: do not use const_cast [cppcoreguidelines-pro-type-const-cast]
const_cast<DeclContext*>(FD->getDeclContext()));
^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
Could we write a few benchmarks to make sure we do not regress in performance? |
ac0c084
to
8b40e97
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
/// Function to define element wise adding of an arrays and an array_ref. | ||
template <typename T, typename U> | ||
CUDA_HOST_DEVICE | ||
array_expression<const array<T>&, BinaryAdd, const array_ref<U>&> | ||
operator+(const array<T>& arr1, const array_ref<U>& arr2) { | ||
assert(arr1.size() == arr2.size()); | ||
return array_expression<const array<T>&, BinaryAdd, const array_ref<U>&>( | ||
arr1, arr2); | ||
} | ||
|
||
/// Function to define element wise adding of an arrays_ref and an array. | ||
template <typename T, typename U> | ||
CUDA_HOST_DEVICE | ||
array_expression<const array_ref<T>&, BinaryAdd, const array<U>&> | ||
operator+(const array_ref<T>& arr1, const array<U>& arr2) { | ||
assert(arr1.size() == arr2.size()); | ||
return array_expression<const array_ref<T>&, BinaryAdd, const array<U>&>( | ||
arr1, arr2); | ||
} | ||
|
||
/// Function to define element wise adding of an arrays and an array_ref. | ||
template <typename T, typename U> | ||
CUDA_HOST_DEVICE | ||
array_expression<const array<T>&, BinarySub, const array_ref<U>&> | ||
operator-(const array<T>& arr1, const array_ref<U>& arr2) { | ||
assert(arr1.size() == arr2.size()); | ||
return array_expression<const array<T>&, BinarySub, const array_ref<U>&>( | ||
arr1, arr2); | ||
} | ||
|
||
/// Function to define element wise adding of an arrays_ref and an array. | ||
template <typename T, typename U> | ||
CUDA_HOST_DEVICE | ||
array_expression<const array_ref<T>&, BinarySub, const array<U>&> | ||
operator-(const array_ref<T>& arr1, const array<U>& arr2) { | ||
assert(arr1.size() == arr2.size()); | ||
return array_expression<const array_ref<T>&, BinarySub, const array<U>&>( | ||
arr1, arr2); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we write tests for these? I think we generally test in Misc
.
operator/(const array_expression<L2, BinOp2, R2>& r) const { | ||
return array_expression<const array_expression<L1, BinOp1, R1>&, BinaryDiv, | ||
const array_expression<L2, BinOp2, R2>&>(*this, r); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise.
for (std::size_t i = 0; i < m_size; i++) | ||
m_arr[i] = arr_exp[i]; | ||
return *this; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise.
// or equal to 0, then log(base) is undefined, and therefore if user only | ||
// requested directional derivative of base^exp w.r.t base -- which is valid | ||
// --, the result would be undefined because as per C++ valid number + NaN * 0 | ||
// = NaN. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this overload?
template <typename T> | ||
CUDA_HOST_DEVICE ValueAndPushforward<T, T> acos_pushforward(T x, T d_x) { | ||
template <typename T, typename dT> | ||
CUDA_HOST_DEVICE ValueAndPushforward<T, dT> acos_pushforward(T x, dT d_x) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these changes not good for a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They can exist on their own but they will have no use. The types don't match in the vectorized fwd mode because those will be T
and clad::array<T>
respectively.
@@ -0,0 +1,20 @@ | |||
#ifndef CLAD_DIFFERENTIATOR_JACOBIANMODEVISITOR_H |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this a private header?
…orized frw mode and jacobians
Previously, jacobians were based on the non-vectorized reverse mode, which was mostly incapable of capturing multiple outputs. The implementation worked in a few particular cases. For example, it was not possible to differentiate function calls or declare variables inside the original function body.
This PR implements jacobians using the vectorized forward mode. At the very least, this will solve the issues described above and give a way forward to solve other ones. This also means introducing features to the vectorized fwd mode will introduce the same features to jacobians.
Let's take a look at the new signature of jacobians. Since the function to be differentiated is expected to have multiple outputs, we should expect the output in the form of array/pointer/reference parameters (just like before). And for every output parameter, we should generate a corresponding adjoint parameter for the user to acquire the results. Since there is no way to specify which parameters are used as output and which are not, adjoints are generated for all array/pointer/reference parameters. For example:
or
array_ref
is necessary for compatibility with the existing infrastructure for vectorized fwd mode overloads generation.This signature is also similar to the old one. e.g.
However, the behavior differs for multiple output parameters, which the old jacobians did not support.
Note: the same functionality can be achieved by using the vectorized reverse mode, which should probably be implemented at some point. However, the old code for jacobians is unlikely to be useful for that, and there is not much point in keeping it.
Fixes #472