
Adding explicit LU factorization within CRAM #2703

Open · wants to merge 2 commits into develop

Conversation

@eepeterson (Contributor) commented Sep 22, 2023

Description

Another performance (and accuracy?) improvement to depletion. For some reason, doing an explicit LU factorization before solving within the CRAM method seems to result in far fewer warnings about negative nuclide densities and gives a ~10% speed up in the cases I have tried. The testing isn't exhaustive, so I'd be curious to find out what others observe. I have attached files below showing the output during the depletion calculation and profile information for my particular example problem. I don't yet fully understand these differences, since they get into scipy and SuperLU internals and may ultimately turn out to be very machine dependent. I'll keep digging, but if anybody has any insight, feel free to chime in.

no_patch.txt
with_patch.txt

Update after merging #2764:
no_patch.txt
with_patch.txt

There is almost a 2x speed up for problems with many repeated timesteps with the same source rate.
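
To make the change concrete, here is a minimal sketch of the kind of loop being modified, assuming a generic IPF CRAM-style formulation (not OpenMC's exact implementation) and that the CRAM residues/poles alpha, theta, alpha0 are supplied elsewhere; only the branch inside the loop differs between the two paths:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as sla

def cram_like_solve(A, n0, dt, alpha, theta, alpha0, direct_solve=False):
    """Sketch only: IPF CRAM-style solve of dn/dt = A n over a step dt."""
    A = sp.csc_matrix(A, dtype=np.float64) * dt
    ident = sp.identity(A.shape[0], format='csc')
    y = np.asarray(n0, dtype=np.float64).copy()
    for a, t in zip(alpha, theta):
        shifted = sp.csc_matrix(A - t * ident)  # complex-shifted matrix
        if direct_solve:
            # original path: spsolve handles the factorization internally
            x = sla.spsolve(shifted, y.astype(np.complex128))
        else:
            # this PR's path: explicit LU factorization, then triangular solves
            x = sla.splu(shifted).solve(y.astype(np.complex128))
        y += 2.0 * np.real(a * x)
    return alpha0 * y
```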

Checklist

- I have performed a self-review of my own code
- [ ] I have run clang-format on any C++ source files (if applicable)
- I have followed the style guidelines for Python source files (if applicable)
- I have made corresponding changes to the documentation (if applicable)
- [ ] I have added tests that prove my fix is effective or that my feature works (if applicable)

Comment on lines 71 to 73
direct_solve : bool
Whether to use scipy.linalg.spsolve directly. Default is False
which performs an explicit LU factorization.
Contributor

If LU factorization is better, why not use that all the time?

Contributor

Sorry, that came out before the rest of the review. So there's some context missing in my comment.

But it also doesn't appear that this is configurable by the user. We pass n_result = func(matrix, n_multi, dt, direct_solve=True) later on. So how would a user be able to configure this setting?

From the Zen of Python / PEP 20

There should be one-- and preferably only one --obvious way to do it.

Contributor Author

I agree. The only reason I added this was to go back to the original implementation if transfer_rates is set. In an ideal world we would just do the explicit factorization, but it doesn't seem to be working correctly when transfer rates are used.

Contributor Author

Part of me wonders if CRAM is not the correct method for solving the coupled block matrices in the transfer rates case, but if anybody has insight on that it would be great.

Contributor

Hi @eepeterson, this seems to be a good improvement!

Just curious, what kind of errors did you notice when using LU with transfer_rates? I've also spent a bit of time recently thinking about alternative methods we could use to speed up solving the block matrix when many transfer rates are set, and LU factorization also came up.

Note also that the block matrix is only constructed when a destination_material is set in the add_transfer_rate method; otherwise it's the same depletion matrix as before with only a removal term added. Did you get strange results with or without a destination_material set?
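
For illustration, the two cases could be sketched roughly like this (hypothetical sizes and rates, not the actual add_transfer_rate code):

```python
import scipy.sparse as sp

n = 100                                            # hypothetical number of nuclides
A1 = sp.random(n, n, density=0.05, format='csc')   # stand-in burnup matrix, material 1
A2 = sp.random(n, n, density=0.05, format='csc')   # stand-in burnup matrix, material 2
r = 1e-5                                           # hypothetical transfer rate (1/s)
I = sp.identity(n, format='csc')

# No destination_material: only a removal term added to the diagonal
A_removal_only = A1 - r * I

# destination_material set: coupled block matrix; atoms removed from
# material 1 show up as a source term for material 2
A_block = sp.bmat([[A1 - r * I, None],
                   [r * I,      A2]], format='csc')
```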

Contributor Author

Thanks for chiming in @church89. I'm seeing strange results primarily in the assert_reaction_rates_equal portion of the deplete_with_transfer_rates regression tests. The most finicky test is the final one in the suite, "depletion_with_transfer", so the destination_material is set. However, sometimes other ones fail as well. A couple of concerns I have are below:

  1. Adding external feed and removal seems to introduce eigenvalues with positive real parts in the burnup matrix, which is outside the region where CRAM's approximation is accurate (a quick way to check this is sketched after this list). They may be small enough with low feed rates and short time steps to not make a big difference, but it's still a concern. Until I can dig into the theory a bit more, it seems like CRAM is ill-suited to this coupled block-matrix approach for transfer rates. By contrast, the eigenvalues of every burnup matrix evaluated in the deplete_with_transport regression test suite all have negative real parts.
  2. Despite the theoretical argument that CRAM shouldn't be used for these problems, the nuclide densities coming out of CRAM after the first timestep agree between develop and this branch to within a relative tolerance of 1e-14. It's when they get put back into the simulation with openmc.lib and the resulting reaction rates are calculated that they disagree with the reference results.
  3. The tolerances on the nuclide concentrations and reaction rates seem very small for simulations with 500 particles and 2 batches. By contrast, the tests in the deplete_no_transport case have atom tolerances on the order of 1e-3 and reaction rate tolerances on the order of 1e-2 or even 1e-1. Right now the tolerances in deplete_with_transfer_rates are 1e-4 and 1e-5, respectively. At the very least, it makes sense to me that the reaction rate tolerances should be larger than the atom tolerances. Were there justifications for the tolerances in this test suite, or could they be changed to be more in line with the other suites?
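
(For reference, the eigenvalue check mentioned in item 1 can be done with something like the snippet below, assuming A is the assembled burnup matrix as a scipy sparse matrix; a dense eigendecomposition like this is only practical at regression-test problem sizes.)

```python
import numpy as np
import scipy.sparse as sp

def has_positive_real_eigenvalues(A, tol=0.0):
    """Return True if any eigenvalue of A has a real part above tol."""
    eigvals = np.linalg.eigvals(sp.csr_matrix(A).toarray())
    return bool(np.any(eigvals.real > tol))
```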

Contributor

@eepeterson, for the transfer_rates case without a destination_material set, I would have expected no issues, independently of the transfer rate values set, as it should simply add a proportional term to the diagonal of the burnup matrix and no block matrix is built.

On the other hand, when destination materials are set, the coupled block matrix becomes very sparse and might introduce some instability in CRAM, but apart from a few more negative nuclide densities the results made sense.

There is no particular reason for the tolerances used, so if it helps pass some more tests and brings them in line with the other suites, we can change them.

@drewejohnson (Contributor) left a comment

I've had this in the back of my head for a bit, and just haven't had time to get around to it.

The first matter at hand is that the current test suite differs. Looking at one of the failed tests, the reaction rates are different enough to fail a check, but not the number densities:

         assert_atoms_equal(res_ref, res_test, 1e-4)
>       assert_reaction_rates_equal(res_ref, res_test)
...
>                       assert y_test == pytest.approx(y_ref, rel=tol), \
                            f'Reaction rate not equal for material {mat}, nuclide '\
                            f'{nuc}, {rx}\ny_ref={y_ref}\ny_test={y_test}'
E                       AssertionError: Reaction rate not equal for material 1, nuclide Cs135, (n,gamma)
E                       y_ref=[      0.         3007214.20178012]
E                       y_test=[      0.         3009864.74338142]

It would make sense that number densities are different, perhaps slightly, because of differences in the solver. And those could be small enough to pass a number density check. But then those small density differences propagate to reaction rate differences, and in this case are not small enough to pass the check.

Now, the question becomes does this change make the results more correct, and thus the failure is because the reference files need to be updated?

From your message, this appears to give a ~10% speed up, which is excellent! So that's a thing to consider here.

You also said there are fewer negative number density warnings, which is also good. CRAM does not promise all densities are non-negative, but having fewer is better. It would be great if you could add a test that shows negative densities with the current implementation, and fewer or no negative densities with SuperLU.

I'm inclined to be in favor of this, but it's not clear we've made the depletion more accurate according to the test suite here. And if SuperLU gives us better accuracy and performance, we should use that in all cases. If I'm a user working on a depletion model, and there are two linear solver methods for depletion, how do I know what to use and why?

@eepeterson (Contributor Author) commented Nov 4, 2023

Thanks for taking a look @drewejohnson! All good points.

Now, the question becomes does this change make the results more correct, and thus the failure is because the reference files need to be updated?

I have tried updating the reference results and they still don't pass in CI. I have struggled to figure out why, and wonder if it has to do with how scipy is built and/or linked on the CI servers vs. my own machine.

It would be great if you could add a test that shows negative densities w/ the current implementation, and fewer or no negative densites with superlu.

Anecdotally, you can see the results in the output of the two text files I posted. The no_patch version has a number of negative density warnings and the with_patch version does not. The code as written doesn't give the user the ability to run depletion calculations with both the current implementation and the new implementation, so a formal test case isn't possible (if that's what you were suggesting).

If I'm a user working on a depletion model, and there are two linear solver methods for depletion, how do I know what to use and why?

The user never really interacts with the CRAM method specifically and I would like to get rid of this kwarg anyway, but I can't figure out how to get these tests to pass both locally and in CI without this logic to revert to the direct solve when transfer_rates is set on the Integrator.

@eepeterson (Contributor Author)

FWIW, the tests in tests/regression_tests/deplete_with_transfer_rates/ do pass locally for me on this branch since I added the direct_solve kwarg to keep the behavior the same as before when transfer_rates is set.

@eepeterson (Contributor Author)

I have updated the profile results for this patch (the difference between current develop and this branch) now that some of the other improvements have been made. This is the same depletion problem as originally posted at the top of this PR. There is still a ~10% speed up and fewer negative density warnings, as expected.

no_patch.txt
with_patch.txt

I don't love the extra logic in the CRAM method here, but I think this is the cleanest approach for now unless we would prefer to create a separate solver class for use when transfer_rates is set. @paulromano and @drewejohnson what do you think?

@eepeterson eepeterson marked this pull request as draft November 8, 2023 22:10
@paulromano paulromano marked this pull request as ready for review November 9, 2023 13:22
@paulromano (Contributor) left a comment

I'm happy with this as-is. As @eepeterson and I have discussed offline, it's not immediately obvious that the assumptions made in the CRAM method are satisfied when trying to solve the block system that arises from the use of transfer rates, which may be why we're seeing unusual behavior there. For now, it seems prudent to keep the current method for the transfer rates functionality while switching to the explicit LU factorization for regular depletion runs.

@drewejohnson are you satisfied with the current approach?

@drewejohnson (Contributor)

I'm in favor of the LU factorization because of the alleged performance boost. Feels great.

it's not immediately obvious that the assumptions made in the CRAM method are satisfied when trying to solve the block system that arises from the use of transfer rates

Yeah, this feels unrelated to the CSC/CSR/LU-factorization thing at hand: is CRAM appropriate with transfer rates? I wasn't around for the transfer rates discussion, so I'm going to assume it's related to moving fuel or depleting systems where an external actor is supplying/removing nuclides. I keep going back to this section from Maria Pusa's paper on CRAM:

However, it was recently discovered by the author that the eigenvalues of the burnup matrix are generally confined to a region near the negative real axis.1 This observation led to applying the Chebyshev Rational Approximation Method to solve the burnup equations. This method can be interpreted as the best rational approximation on the negative real axis

So if you're using transfer rates as a way to scrub fuel, like off-gassing fission products, then the diagonal of the matrix will remain negative. Probably okay. But if you include enough addition of material that you have a positive rate of change, CRAM may not be valid.
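
As a toy illustration of that point (hypothetical scalar rates, not a real burnup matrix):

```python
# dN/dt = a * N -- the sign of `a` decides whether CRAM's approximation applies
loss = 1e-4               # hypothetical removal rate: decay + absorption + off-gassing (1/s)
feed = 2e-4               # hypothetical feed rate, here assumed proportional to inventory (1/s)

a_scrubbing = -loss       # negative: decaying solution, the regime CRAM is built for
a_net_feed = feed - loss  # positive: exp(+a*t) growth, off the negative real axis
print(a_scrubbing, a_net_feed)
```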

I personally don't like the inclusion of the conditionals in the CRAM solve. Is the rationale behind not using LU for transfer-rate problems that we can't get them to pass on CI? Is there a way to generate the reference data on the CI machine and pull those? Otherwise we're at the mercy of machine differences, library versions, compilers, all those fun things Python hides from us 🙂

@eepeterson (Contributor Author) commented Nov 10, 2023

I personally don't like the inclusion of the conditionals in the CRAM solve. Is the rationale behind not using LU for transfer rate problems we can't get them to pass on CI? Is there a way to generate the reference data on the CI machine and pull those?

See PR #2764, @drewejohnson. I think I'm going to table this one until that one is sorted out; once we understand the issues in the tests with that PR, this one will sort itself out. In that PR I didn't change the solver, just the matrix format, and the tests still didn't pass; I had to loosen the tolerance significantly for them to pass. I would still like to find the fundamental cause, but I think it's unrelated to the splu object, as evidenced in #2764.
