-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new MatrixFields module, along with unit tests and performance tests #1326
Conversation
f860625
to
f15ba20
Compare
c7ea45c
to
20d7b3d
Compare
03a4c2e
to
a4c796a
Compare
1334: Extend and improve RecursiveApply, fix DSS inference r=charleskawczynski a=charleskawczynski This PR: - Extends and improves RecursiveApply (incorporates some changes/functionality in #1326) by working with Tuples/NamedTuples in a type-stable way - Adds RecursiveApply tests - Fix DSS inference and corresponding test Co-authored-by: Charles Kawczynski <[email protected]>
1334: Extend and improve RecursiveApply, fix DSS inference r=charleskawczynski a=charleskawczynski This PR: - Extends and improves RecursiveApply (incorporates some changes/functionality in #1326) by working with Tuples/NamedTuples in a type-stable way - Adds RecursiveApply tests - Fix DSS inference and corresponding test Co-authored-by: Charles Kawczynski <[email protected]>
f68577d
to
6265bdf
Compare
Couple of updates:
|
bors try |
tryBuild failed: |
6265bdf
to
44e927c
Compare
bors try |
tryBuild failed: |
681faf7
to
c8343c5
Compare
Also, it looks like the new invalidations are due to adding BandedMatrices.jl as a dependency to ClimaCore, because they remain unchanged when I comment out the new code in the |
I'm broadly okay with this, but it is pretty complex, so it could do with a few more signposts so that others can find their way around. Could you expand the docs a bit to explain how it works? |
c8343c5
to
ae9e78e
Compare
@simonbyrne @charleskawczynski I've added a new |
bors try |
bors try- |
ae9e78e
to
684e0a8
Compare
bors try |
tryBuild failed: |
684e0a8
to
e50d2c9
Compare
bors try |
tryBuild succeeded! The publicly hosted instance of bors-ng is deprecated and will go away soon. If you want to self-host your own instance, instructions are here. If you want to switch to GitHub's built-in merge queue, visit their help page. |
e50d2c9
to
eb42833
Compare
eb42833
to
e71b858
Compare
bors r+ |
1326: Add new MatrixFields module, along with unit tests and performance tests r=dennisYatunin a=dennisYatunin # Purpose This PR encompasses the first two tasks of the implicit interface updates [SDI](#1230). That is, it refactors `StencilCoefs` into `BandMatrixRow` and `ApplyStencil`/`ComposeStencil` into `MultiplyColumnwiseBandMatrixField`, and it also adds a comprehensive set of tests. In the process of refactoring, the code for low-level matrix operations has been simplified and useful new functionality has been added. In order to avoid conflicts with pre-existing code, the new data structures and functions have been placed in a separate module called `MatrixFields`; this module can be removed once we are ready to replace all of the old code. ## Content ### API improvements - Matrix field rows now have simple constructors (e.g., `DiagonalMatrixRow` and `TridiagonalMatrixRow`) that can be used to construct matrix fields on the fly in broadcast expressions. - Matrix fields are now automatically promoted to common types for basic arithmetic operations. This involves promoting the entries in the rows, as well as padding rows with zeros when rows have unequal numbers of entries. In addition, `LinearAlgebra.I` (or, rather, `LinearAlgebra.UniformScaling`) is automatically promoted to a `DiagonalMatrixRow` when used with `BandMatrixRow`s. So, for example, it is now possible to evaluate `DiagonalMartrixRow(1) - 2 * TridiagonalMatrixRow(3, 4, 5) - I / 6`. - Matrix fields are now fully integrated with `RecursiveApply`, which allows a single matrix field to store multiple matrices by using `Tuple`s or `NamedTuple`s as entries instead of scalars. This is achieved by using the new `rzero` function instead of `zero`, and also by making `+`, `-`, `*`, and `/` call `radd`, `rsub`, `rmul`, and `rdiv` when used on `BandMatrixRow`s. This change will make it possible to, e.g., store the derivatives of all tracer tendencies with respect to vertical velocity as a single matrix field in `ClimaAtmos`, which will lead to simpler code and better performance. - All matrix-matrix and matrix-vector multiplications can now be handled using the `⋅` operator, which is an alias for `MultiplyColumnwiseBandMatrixField()`. The implementation of this operator is much simpler than the implementation of `ApplyStencil`/`ComposeStencil`, and it is effectively localized to the single function `multiply_matrix_at_index`. In order to avoid inference failures for complex matrix field broadcast expressions, the recursion limit for this function needed to be disabled. Unfortunately, just as with the pre-existing code, inlining this function with ``@propagate_inbounds`` drastically increased compilation time, and it also caused inference failures for particularly complicated matrix broadcast expressions. Although not using ``@propagate_inbounds`` causes a 3--4x slowdown in the evaluation of matrix broadcast expressions, it avoids a 10--100x slowdown in their compilation. - The matrix of the divergence operator can now be properly represented as a matrix of covectors, rather than a matrix of scalars. Since the divergence operator turns vectors into scalars, multiplying its matrix by a matrix/vector of vectors should return a matrix/vector of scalars. However, this is not as simple as just multiplying a covector by a vector and returning a scalar, since the vector may need to be projected onto the dual axis of the covector before the multiplication. So, matrix-matrix and matrix-vector multiplication is now implemented using the `rmul_with_projection` function instead of `rmul`. In general, this function works just like `rmul`, but, when the first argument is a covector (an `AdjointAxisVector`) and the second is a vector or higher-order tensor (an `AxisTensor`), it uses the local geometry to project the second argument before the multiplication. In the future, we may want to generalize this to higher-order cotensors. - The `MatrixFields` module also includes functions for converting matrix fields to more conventional arrays that are compatible with `LinearAlgebra.jl`. The function `column_field2array` converts a matrix field defined on a column space to a `BandedMatrix` from `BandedMatrices.jl`, and it also converts a regular field defined on a column space to a `Vector`. The function `field2arrays` works for fields defined on multiple columns by calling `column_field2array` on each column. - Matrix fields now get displayed in the REPL as actual matrices. Specifically, when a matrix field contains `Number` entries, its first column is passed to `column_field2array_view`, which is identical to `column_field2array`, except that it allocates less memory by returning a view of the field's underlying data. Although `column_field2array` and `column_field2array_view` work for matrix fields with `Tuple`/`NamedTuple` entries, the `show` method for a `BandedMatrix` crashes when displaying such entries. In addition, this `show` method results in somewhat unreadable output for other struct entries, like vectors and covectors. So, all non-`Number` matrix fields still get displayed in the REPL like regular fields. ### Related ClimaCore Changes - To ensure that promotion and conversion of `BandMatrixRow` entries works correctly for nested `Tuple`/`NamedTuple` entries, the `rpromote_type` and `rconvert` functions have been added to RecursiveApply. The first function uses `rmaptype` to recursively call `promote_type`, and the second function uses `rmap` and `rzero` to make a more type-stable version of Julia's built-in `convert` function. - To support manipulating matrices of covectors (and, in the future, matrices of higher-order cotensors), all basic arithmetic operations have been defined for `AdjointAxisTensor`. This includes `+`, `-`, `*`, `/`, `\`, and `==`, as well as the `zero` function. The new methods undo the adjoint and fall back to the methods for `AxisTensor`. - To simplify the setup of the non-scalar matrix tests, `map` has been generalized from only working for a single `Field` to working for multiple `Field`s. ### Tests - All of the following test files check for correctness, allocations, and type instabilities. - The file `test/MatrixFields/band_matrix_row.jl` tests low-level operations with `BandMatrixRow`s, ensuring that constructors, conversions, arithmetic operations, and combinations with `LinearAlgebra.I` work for numbers and nested `Tuple`/`NamedTuple` values. - The file `test/MatrixFields/rmul_with_projection.jl` tests `rmul_with_projection` for all currently supported combinations of scalars, vectors, tensors, and covectors, as well as several combinations of nested `Tuple`/`NamedTuple` values. - The file `test/MatrixFields/field2array.jl` checks that `column_field2array`, `column_field2array_view`, and `field2arrays` work as expected for very simple matrix fields. - The file `test/MatrixFields/matrix_field_broadcasting.jl` tests a wide range of scalar and non-scalar matrix field broadcast expressions. It compares each scalar matrix field broadcast expression against an equivalent implementation using `field2arrays` and LinearAlgebra's `mul!`, making sure that the matrix field implementation is roughly as performant as the array implementation. It also compares each non-scalar matrix field broadcast expression (e.g., an expression with matrices of covectors or `NamedTuple`s) against an equivalent scalar matrix field implementation. - I have also checked that the new implementation is at least as performant as the pre-existing implementation for those cases that the pre-existing API can support. For example, to compare the two implementations for the "diagonal matrix times bi-diagonal matrix times tri-diagonal matrix times quad-diagonal matrix times vector" test case, run the following code after `test/MatrixFields/matrix_field_broadcasting.jl`: ``` using ClimaCore: MatrixFields, Operators function compare_to_old_implementation() ᶜᶜmat, ᶜᶠmat, ᶠᶠmat, ᶠᶜmat, ᶜvec, ᶠvec = random_test_fields(Float64) result = `@.` ᶜᶜmat ⋅ ᶜᶠmat ⋅ ᶠᶠmat ⋅ ᶠᶜmat ⋅ ᶜvec inputs = (ᶜᶜmat, ᶜᶠmat, ᶠᶠmat, ᶠᶜmat, ᶜvec) func!(result, ᶜᶜmat, ᶜᶠmat, ᶠᶠmat, ᶠᶜmat, ᶜvec) = `@.` result = ᶜᶜmat ⋅ ᶜᶠmat ⋅ ᶠᶠmat ⋅ ᶠᶜmat ⋅ ᶜvec time, allocs = `@benchmark` call_func(func!, result, inputs) to_old_version(row::BMR) where {BMR <: MatrixFields.BandMatrixRow} = Operators.StencilCoefs{MatrixFields.outer_diagonals(BMR)...}(row.entries) apply = Operators.ApplyStencil() compose = Operators.ComposeStencils() ᶜᶜmat_old = to_old_version.(ᶜᶜmat) ᶜᶠmat_old = to_old_version.(ᶜᶠmat) ᶠᶠmat_old = to_old_version.(ᶠᶠmat) ᶠᶜmat_old = to_old_version.(ᶠᶜmat) result_old = `@.` apply(compose(compose(compose(ᶜᶜmat_old, ᶜᶠmat_old), ᶠᶠmat_old), ᶠᶜmat_old), ᶜvec) inputs_old = (ᶜᶜmat_old, ᶜᶠmat_old, ᶠᶠmat_old, ᶠᶜmat_old, ᶜvec) func!_old(result_old, ᶜᶜmat_old, ᶜᶠmat_old, ᶠᶠmat_old, ᶠᶜmat_old, ᶜvec) = `@.` result_old = apply(compose(compose(compose(ᶜᶜmat_old, ᶜᶠmat_old), ᶠᶠmat_old), ᶠᶜmat_old), ᶜvec) time_old, allocs_old = `@benchmark` call_func(func!_old, result_old, inputs_old) `@assert` allocs == allocs_old == 0 `@info` "New implementation time: $time s" `@info` "Old implementation time: $time_old s" end compare_to_old_implementation() ``` This code prints out the following: ``` [ Info: New implementation time: 1.5848e-5 s [ Info: Old implementation time: 1.5283e-5 s ``` Co-authored-by: Dennis Yatunin <[email protected]>
Build failed: |
bors r+ |
Build succeeded! The publicly hosted instance of bors-ng is deprecated and will go away soon. If you want to self-host your own instance, instructions are here. If you want to switch to GitHub's built-in merge queue, visit their help page. |
Purpose
This PR encompasses the first two tasks of the implicit interface updates SDI. That is, it refactors
StencilCoefs
intoBandMatrixRow
andApplyStencil
/ComposeStencil
intoMultiplyColumnwiseBandMatrixField
, and it also adds a comprehensive set of tests. In the process of refactoring, the code for low-level matrix operations has been simplified and useful new functionality has been added. In order to avoid conflicts with pre-existing code, the new data structures and functions have been placed in a separate module calledMatrixFields
; this module can be removed once we are ready to replace all of the old code.Content
API improvements
DiagonalMatrixRow
andTridiagonalMatrixRow
) that can be used to construct matrix fields on the fly in broadcast expressions.LinearAlgebra.I
(or, rather,LinearAlgebra.UniformScaling
) is automatically promoted to aDiagonalMatrixRow
when used withBandMatrixRow
s. So, for example, it is now possible to evaluateDiagonalMartrixRow(1) - 2 * TridiagonalMatrixRow(3, 4, 5) - I / 6
.RecursiveApply
, which allows a single matrix field to store multiple matrices by usingTuple
s orNamedTuple
s as entries instead of scalars. This is achieved by using the newrzero
function instead ofzero
, and also by making+
,-
,*
, and/
callradd
,rsub
,rmul
, andrdiv
when used onBandMatrixRow
s. This change will make it possible to, e.g., store the derivatives of all tracer tendencies with respect to vertical velocity as a single matrix field inClimaAtmos
, which will lead to simpler code and better performance.⋅
operator, which is an alias forMultiplyColumnwiseBandMatrixField()
. The implementation of this operator is much simpler than the implementation ofApplyStencil
/ComposeStencil
, and it is effectively localized to the single functionmultiply_matrix_at_index
. In order to avoid inference failures for complex matrix field broadcast expressions, the recursion limit for this function needed to be disabled. Unfortunately, just as with the pre-existing code, inlining this function with@propagate_inbounds
drastically increased compilation time, and it also caused inference failures for particularly complicated matrix broadcast expressions. Although not using@propagate_inbounds
causes a 3--4x slowdown in the evaluation of matrix broadcast expressions, it avoids a 10--100x slowdown in their compilation.rmul_with_projection
function instead ofrmul
. In general, this function works just likermul
, but, when the first argument is a covector (anAdjointAxisVector
) and the second is a vector or higher-order tensor (anAxisTensor
), it uses the local geometry to project the second argument before the multiplication. In the future, we may want to generalize this to higher-order cotensors.MatrixFields
module also includes functions for converting matrix fields to more conventional arrays that are compatible withLinearAlgebra.jl
. The functioncolumn_field2array
converts a matrix field defined on a column space to aBandedMatrix
fromBandedMatrices.jl
, and it also converts a regular field defined on a column space to aVector
. The functionfield2arrays
works for fields defined on multiple columns by callingcolumn_field2array
on each column.Number
entries, its first column is passed tocolumn_field2array_view
, which is identical tocolumn_field2array
, except that it allocates less memory by returning a view of the field's underlying data. Althoughcolumn_field2array
andcolumn_field2array_view
work for matrix fields withTuple
/NamedTuple
entries, theshow
method for aBandedMatrix
crashes when displaying such entries. In addition, thisshow
method results in somewhat unreadable output for other struct entries, like vectors and covectors. So, all non-Number
matrix fields still get displayed in the REPL like regular fields.Related ClimaCore Changes
BandMatrixRow
entries works correctly for nestedTuple
/NamedTuple
entries, therpromote_type
andrconvert
functions have been added to RecursiveApply. The first function usesrmaptype
to recursively callpromote_type
, and the second function usesrmap
andrzero
to make a more type-stable version of Julia's built-inconvert
function.AdjointAxisTensor
. This includes+
,-
,*
,/
,\
, and==
, as well as thezero
function. The new methods undo the adjoint and fall back to the methods forAxisTensor
.map
has been generalized from only working for a singleField
to working for multipleField
s.Tests
test/MatrixFields/band_matrix_row.jl
tests low-level operations withBandMatrixRow
s, ensuring that constructors, conversions, arithmetic operations, and combinations withLinearAlgebra.I
work for numbers and nestedTuple
/NamedTuple
values.test/MatrixFields/rmul_with_projection.jl
testsrmul_with_projection
for all currently supported combinations of scalars, vectors, tensors, and covectors, as well as several combinations of nestedTuple
/NamedTuple
values.test/MatrixFields/field2array.jl
checks thatcolumn_field2array
,column_field2array_view
, andfield2arrays
work as expected for very simple matrix fields.test/MatrixFields/matrix_field_broadcasting.jl
tests a wide range of scalar and non-scalar matrix field broadcast expressions. It compares each scalar matrix field broadcast expression against an equivalent implementation usingfield2arrays
and LinearAlgebra'smul!
, making sure that the matrix field implementation is roughly as performant as the array implementation. It also compares each non-scalar matrix field broadcast expression (e.g., an expression with matrices of covectors orNamedTuple
s) against an equivalent scalar matrix field implementation.test/MatrixFields/matrix_field_broadcasting.jl
: