
move computation of wet_fac to post_init #22

Merged
merged 1 commit into ocean-eddy-cpt:master on Feb 11, 2021

Conversation

NoraLoose
Member

This PR does the following things:

In CartesianLaplacianWithLandMask:

  • moves the computation of fac (now: wet_fac) to post_init. This factor is independent of the field the Laplacian is applied to, so if we apply the Laplacian many times (as we will for the filtering operation), computing the factor only once should speed things up.
  • deletes one unnecessary line: out = field.copy()

Following @rabernat's recommendation in #21 (comment), I also disabled the mypy check.
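
To make the change concrete, here is a minimal sketch of the idea (illustrative only; the real gcm_filters kernel also dispatches between NumPy and CuPy via get_array_module, and its stencil details may differ): wet_fac depends only on wet_mask, so it can be computed once in the dataclass's __post_init__ rather than on every application of the Laplacian.

from dataclasses import dataclass
import numpy as np

@dataclass
class CartesianLaplacianWithLandMask:
    # wet_mask: 1 over ocean points, 0 over land points
    wet_mask: np.ndarray

    def __post_init__(self):
        # number of wet neighbours of each cell; depends only on the grid,
        # not on the field being filtered, so compute it once here
        mask = self.wet_mask
        self.wet_fac = (
            np.roll(mask, -1, axis=-1)
            + np.roll(mask, 1, axis=-1)
            + np.roll(mask, -1, axis=-2)
            + np.roll(mask, 1, axis=-2)
        )

    def __call__(self, field):
        # masked 5-point Laplacian; reuses the precomputed wet_fac
        out = self.wet_mask * field
        out = (
            np.roll(out, -1, axis=-1)
            + np.roll(out, 1, axis=-1)
            + np.roll(out, -1, axis=-2)
            + np.roll(out, 1, axis=-2)
            - self.wet_fac * out
        )
        return self.wet_mask * out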

@rabernat
Contributor

This is great, thanks for the update! Two questions.

  • Do you know if this is actually faster? Have you tested that? Just curious...
  • Have you verified that it works on GPU? That's something our test suite currently can't do, since we don't have a GPU inside the GitHub workflow.

@NoraLoose
Member Author

Thanks for looking at this!

Have you verified that it works on GPU? That's something our test suite currently can't do, since we don't have a GPU inside the GitHub workflow.

I just tested this. Yes, it works on GPU.

Do you know if this is actually faster? Have you tested that? Just curious...

The answer is that I just assumed it would be faster. It turns out it takes pretty much the same time; see this notebook.

The good news is that gcm_filters is much faster on GPU than on CPU (as we saw before).
For the example in the notebook:

  • CPU: 3min 27.0s
  • GPU: 14.5s
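
For completeness, a toy harness along these lines (not the linked notebook; the stencil, array sizes, and field count below are made up, and CuPy with a working GPU is assumed) shows how the same code path can be timed on CPU and GPU simply by switching the array type:

import time
import numpy as np
import cupy as cp

def laplacian(field, xp):
    # simple 5-point stencil; xp is either numpy or cupy
    return (
        xp.roll(field, -1, axis=-1)
        + xp.roll(field, 1, axis=-1)
        + xp.roll(field, -1, axis=-2)
        + xp.roll(field, 1, axis=-2)
        - 4 * field
    )

def time_filter(xp, n_fields=1600, n_steps=8, shape=(512, 512)):
    field = xp.ones(shape)
    start = time.perf_counter()
    for _ in range(n_fields):
        out = field
        for _ in range(n_steps):
            out = laplacian(out, xp)
    if xp is cp:
        cp.cuda.Stream.null.synchronize()  # wait for queued GPU work to finish
    return time.perf_counter() - start

print("CPU:", time_filter(np))
print("GPU:", time_filter(cp))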

Since this new implementation is not faster, we can just disregard this PR. Sorry for the noise!

@NoraLoose
Member Author

Taking a closer look at the code in filter.py, I now understand why this PR doesn't lead to a large speedup.

In my example, I am filtering 1600 2D fields (100 time slices, 16 layers). For each of the 1600 fields, the Laplacian is applied 8 times (since I chose n_steps = 8). I initially thought that with my PR, wet_fac would be computed only once, instead of 1600 * 8 times. But this is not the case. With the following code structure, post_init() for the Laplacian (and the computation of wet_fac) is called 1600 times.

field_smooth = xr.apply_ufunc(
    filter_func,
    field,
    *grid_args,
    input_core_dims=n_args * [dims],
    output_core_dims=[dims],
    output_dtypes=[field.dtype],
    dask="parallelized",
)

In summary, while I expected to reduce 1600 * 8 computations to 1 computation, my PR led only to a reduction from 1600 * 8 to 1600 * 1 computations. Since you typically do not choose very high values for n_steps (you will actually run into numerical instabilities for large n_steps), I suggest that we just stick to the current implementation and close this PR.
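
A minimal, self-contained illustration of this point (hypothetical; not gcm_filters code) is to count how often the wrapped function runs when a chunked DataArray goes through apply_ufunc with dask="parallelized": anything constructed inside the function, like the Laplacian and its wet_fac, is rebuilt once per 2D chunk.

import dask
import numpy as np
import xarray as xr

calls = 0

def filter_func(field):
    global calls
    calls += 1    # stands in for the per-call kernel initialization
    return field  # no-op "filter"

field = xr.DataArray(
    np.zeros((100, 16, 32, 32)),
    dims=["time", "layer", "y", "x"],
).chunk({"time": 1, "layer": 1})  # one dask chunk per 2D slice

smoothed = xr.apply_ufunc(
    filter_func,
    field,
    input_core_dims=[["y", "x"]],
    output_core_dims=[["y", "x"]],
    output_dtypes=[field.dtype],
    dask="parallelized",
)

with dask.config.set(scheduler="synchronous"):  # single-threaded, so the counter is reliable
    smoothed.compute()

print(calls)  # roughly 1600: one call per 2D chunk (plus possibly a dtype/meta inference call)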

@rabernat
Contributor

rabernat commented Feb 11, 2021

With the following code structure, post_init() for the Laplacian (and the computation of wet_fac) is called 1600 times.

Thanks for digging into this. It is extremely useful information. Did you run profiling to get it? If so, can you post the profile report here?

I think we should keep this PR and think of a way to refactor the kernel so that the initialization only has to be done once. (That can be a separate PR and should probably be done by me.)

@rabernat rabernat merged commit 8b92035 into ocean-eddy-cpt:master Feb 11, 2021
@NoraLoose
Member Author

Thanks for digging into this. It is extremely useful information. Did you run profiling to get it? If so, can you post the profile report here?

I haven't done any rigorous profiling, but saw no significant difference in the execution time. That, together with looking at the code structure, led me to this conclusion. But I could be wrong.

I think I will prioritize the implementation of the Laplacian kernels for some of the gcm grids in the near future, but could get back to this performance question at a later time.
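
For the record, one way this could be confirmed without a full benchmark is to profile a single compute and look at how often the kernel constructor runs; field_smooth below stands in for the lazy filtered result from the notebook, and the single-threaded dask scheduler is used so the profiler sees work done inside the chunks.

import cProfile
import pstats
import dask

# field_smooth is assumed to be the dask-backed result of the filtering call
with dask.config.set(scheduler="synchronous"):
    cProfile.run("field_smooth.compute()", "filter.prof")

stats = pstats.Stats("filter.prof").sort_stats("cumulative")
# the ncalls column for the kernel's __init__/__post_init__ shows how often it is rebuilt
stats.print_stats("Laplacian")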

@rabernat
Contributor

rabernat commented Feb 11, 2021

That, together with looking at the code structure, led me to this conclusion

I think you are right. The kernel is initialized inside the function passed to apply_ufunc:

def filter_func(field, *args):
    # these next steps are a kind of hack we use to turn keyword arguments
    # into regular arguments; the reason for doing this is that Xarray's
    # apply_ufunc machinery works a lot better with regular arguments
    assert len(args) == len(Laplacian.required_grid_args())
    grid_vars = {k: v for k, v in zip(Laplacian.required_grid_args(), args)}
    laplacian = Laplacian(**grid_vars)
    np = get_array_module(field)

We should be able to move the initialization (Laplacian()) outside.
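
Something along these lines, sketched roughly here (not a working patch; the per-step filter coefficients are elided, and whether the kernel object can be shipped to dask workers as-is would need checking), would build the kernel once from the grid variables and let filter_func close over it:

grid_vars = {k: v for k, v in zip(Laplacian.required_grid_args(), grid_args)}
laplacian = Laplacian(**grid_vars)  # initialized once, outside filter_func

def filter_func(field):
    # laplacian (and its precomputed wet_fac) is reused for every field
    out = field
    for _ in range(n_steps):
        out = laplacian(out)  # plus the filter-specific scaling at each step
    return out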
