Sparse weights in conservative method #49

Merged: 8 commits merged into main from sparse-weights on Sep 24, 2024
Conversation

@slevang (Collaborator) commented on Sep 20, 2024

Potential improvement for #42.

This package's focus on rectilinear grids, and the factorization of regridding along dimensions, make generating and using dense weights feasible. However, the level of sparsity in the weights matrix is still extremely high for any reasonably sized grid. I did some experiments converting the weights to a sparse matrix after creation, and am seeing nice improvements in both compute time and memory footprint.

On the example in #42 (comment) I get close to a 4x speedup (and better than xesmf):

CPU times: user 42.5 s, sys: 6.01 s, total: 48.5 s
Wall time: 11.6 s
CPU times: user 6min 9s, sys: 41.6 s, total: 6min 51s
Wall time: 59.2 s
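
For anyone curious, here's a minimal numpy-level sketch of the idea (hypothetical toy weights, not this package's actual code path): conservative weights are overwhelmingly zeros, so converting them to a pydata/sparse COO array after construction shrinks the memory footprint and lets the contraction skip the zero entries.

```python
import numpy as np
import sparse  # pydata/sparse

# Hypothetical 1D conservative weights: 3600 source cells -> 1800 target cells.
# Each target cell only overlaps two source cells, so the matrix is ~99.9% zeros.
n_src, n_tgt = 3600, 1800
dense_weights = np.zeros((n_tgt, n_src))
for i in range(n_tgt):
    dense_weights[i, 2 * i : 2 * i + 2] = 0.5

sparse_weights = sparse.COO.from_numpy(dense_weights)
print(f"dense:  {dense_weights.nbytes / 1e6:6.1f} MB")   # ~51.8 MB
print(f"sparse: {sparse_weights.nbytes / 1e6:6.1f} MB")  # ~0.1 MB

# Applying the weights: the matmul dispatches to sparse's dot,
# which only touches the nonzero entries; the product comes back dense.
data = np.random.default_rng(0).standard_normal((n_src, 720))
regridded = sparse_weights @ data  # shape (1800, 720)
```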

@BSchilperoort (Contributor) commented on Sep 20, 2024

> get close to a 4x speedup (and better than xesmf):

Awesome! I did not bother with this originally for the reasons you mentioned. But it's great to see that it's a (relatively easy) way to gain a lot of performance.

Edit:
I don't see any significant performance gain compared to the benchmark I ran from #42...

@slevang (Collaborator, Author) commented on Sep 21, 2024

That was a VM with a pretty old CPU architecture, and I guess it's just really slow on these particular calculations. I'm also seeing much faster results on my 8-core M1 Mac: about 12 s for the skipna=False case on current main, dropping to 5 s with sparse weights. The improvement is even bigger with skipna=True, I think because the sparsity limits the size of the weight array as we track NaNs over each dim.

Mixed results switching between the threaded and distributed schedulers, sometimes a bit faster, sometimes slower.
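
To illustrate the skipna idea (a toy sketch with made-up weights, not the implementation in this PR): fill NaNs with zeros before applying the weights, push the same weights through the valid-data mask, and renormalize by the valid weight at the end. Sparsity keeps both contractions cheap.

```python
import numpy as np
import sparse

rng = np.random.default_rng(0)
n_src, n_tgt = 1000, 500

# Toy weights: each target cell averages two adjacent source cells.
rows = np.repeat(np.arange(n_tgt), 2)
cols = np.arange(n_src)
weights = sparse.COO(np.stack([rows, cols]), np.full(n_src, 0.5), shape=(n_tgt, n_src))

data = rng.standard_normal((n_src, 50))
data[rng.random(data.shape) < 0.1] = np.nan  # sprinkle in missing values

valid = ~np.isnan(data)
weighted_sum = weights @ np.where(valid, data, 0.0)  # NaNs contribute nothing
valid_weight = weights @ valid.astype(float)         # valid weight per target cell

with np.errstate(invalid="ignore", divide="ignore"):
    regridded = weighted_sum / valid_weight  # NaN only where no valid source cells
```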

@slevang (Collaborator, Author) commented on Sep 21, 2024

I ran the benchmarking test from #42 across several configurations on my 8-core i7 Linux desktop. Runtimes are to the nearest second:

chunks={"time": 1}, ~4MB

| skipna=False | threads | distributed |
| --- | --- | --- |
| sparse | 8 | 14 |
| dense | 28 | 17 |
| xesmf | 30 | 37 |

| skipna=True | threads | distributed |
| --- | --- | --- |
| sparse | 67 | 82 |
| dense | 327 | 335 |
| xesmf | 55 | 71 |

chunks={"time": 10}, ~40MB

| skipna=False | threads | distributed |
| --- | --- | --- |
| sparse | 6 | 7 |
| dense | 13 | 12 |
| xesmf | 7 | 6 |

| skipna=True | threads | distributed |
| --- | --- | --- |
| sparse | 59 | 72 |
| dense | OOM | OOM |
| xesmf | 10 | 12 |

Lots of interesting variation. My takeaways:

  • sparse weights seem to uniformly benefit run time
  • sparse weights make the NaN tracking scheme over dimensions feasible; otherwise, for larger chunks, the weights matrix size blows up
  • the only case where the distributed scheduler won was dense weights with small chunks. Keep in mind this is just the defaults, so 2 workers and 4 threads on my machine.
  • with sparse weights we're on par with or better than xesmf, except for skipna=True, where xesmf's scheme of simultaneously computing the NaN fraction over all dimensions is much more efficient. This gets washed out with small chunk sizes but is more apparent for larger ones.

@BSchilperoort (Contributor) previously approved these changes on Sep 24, 2024 and left a comment:
Thanks for the benchmarks and optimizations! Feel free to merge once you've updated the changelog 🚀

@slevang merged commit bc7be5b into main on Sep 24, 2024 (11 checks passed).
@BSchilperoort deleted the sparse-weights branch on September 24, 2024 at 15:48.