Is your feature request related to a problem? Please describe.
We need a columnwise tridiagonal solver for the GPU: with that, we can plug it into CliMA/ClimaAtmos.jl#1813, which from what I can tell is the main blocker of running the implicit solver on GPUs.
Describe the solution you'd like
I want a function:
column_thomas_solve!(S_column, xᶠ𝕄)
where S_column is a Field of StencilCoefs representing a tridiagonal operator, and xᶠ𝕄 is a scalar Field on the same space. The function applies the Thomas algorithm in place.
It should not allocate any additional arrays (i.e. it can mutate both args).
It should work on both CPUs and GPUs.
It should work on columns, 2D extruded fields (CPU only for now), and 3D extruded fields.
It needs tests.
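For reference, the in-place Thomas algorithm itself can be sketched on plain vectors. This is only an illustration under assumptions: `thomas_solve!` and its argument layout (`a`, `b`, `c` for the sub-, main and super-diagonal) are hypothetical names, not ClimaCore API, and a real version would read the coefficients from the StencilCoefs field instead. It allocates nothing and overwrites both the diagonal and the right-hand side, which matches the "can mutate both args" requirement.

```julia
# Hypothetical helper on plain vectors (a stand-in for one column, not ClimaCore API).
# a: sub-diagonal   (a[1] is unused)
# b: main diagonal  (overwritten during elimination)
# c: super-diagonal (c[end] is unused)
# x: right-hand side on entry, solution on exit
function thomas_solve!(a, b, c, x)
    n = length(x)
    @inbounds begin
        # Forward elimination: remove the sub-diagonal row by row.
        for i in 2:n
            w = a[i] / b[i - 1]
            b[i] -= w * c[i - 1]
            x[i] -= w * x[i - 1]
        end
        # Back substitution: solve from the last row upwards.
        x[n] /= b[n]
        for i in (n - 1):-1:1
            x[i] = (x[i] - c[i] * x[i + 1]) / b[i]
        end
    end
    return x
end
```

Note that this variant does no pivoting, so it assumes the operator is well conditioned for elimination without row swaps (e.g. diagonally dominant).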
The easiest way to do it is to write a function which takes an extra hidx arg and applies it to a single column, e.g.
function column_thomas_solve!(S_column, xᶠ𝕄, hidx)
    # if operating on Fields, use Operators.getidx/Operators.setidx!
    # if using DataLayouts, use getindex(data, CartesianIndex((i,j,k,v,h))) / setindex!
end
then on the CPU you can call it from a loop over columns, or on the GPU make a simple kernel function which calls it:
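A rough sketch of that pattern on the CPU, using plain `(nlevels, ncolumns)` matrices as a stand-in for the Field data. The names `thomas_solve_column!` and `thomas_solve_columns!` and the array layout are assumptions made for illustration; they are not the ClimaCore interface.

```julia
# Hypothetical per-column solver: every array is (nlevels, ncolumns) and
# hidx selects the column. b and x are overwritten; nothing is allocated.
function thomas_solve_column!(a, b, c, x, hidx)
    n = size(x, 1)
    @inbounds begin
        # Forward elimination down the column.
        for v in 2:n
            w = a[v, hidx] / b[v - 1, hidx]
            b[v, hidx] -= w * c[v - 1, hidx]
            x[v, hidx] -= w * x[v - 1, hidx]
        end
        # Back substitution up the column.
        x[n, hidx] /= b[n, hidx]
        for v in (n - 1):-1:1
            x[v, hidx] = (x[v, hidx] - c[v, hidx] * x[v + 1, hidx]) / b[v, hidx]
        end
    end
    return nothing
end

# CPU driver: columns are independent, so a plain loop over hidx is enough.
function thomas_solve_columns!(a, b, c, x)
    for hidx in axes(x, 2)
        thomas_solve_column!(a, b, c, x, hidx)
    end
    return x
end
```

On the GPU the same per-column function could be called from a kernel in which each thread computes its own `hidx`. The kernel and its launch are omitted here because they depend on the backend.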