
Adding explicit LU factorization within CRAM #2703

Open · wants to merge 2 commits into develop

Conversation

@eepeterson (Contributor) commented Sep 22, 2023

Description

Another performance (and accuracy?) improvement to depletion. For some reason, doing an explicit LU factorization before solving within the CRAM method seems to result in far fewer warnings about negative nuclide densities and gives a ~10% speed up in the cases I have tried. The testing isn't exhaustive, so I'd be curious to find out what others observe. I have attached files below showing the output during the depletion calculation and profile information for my particular example problem. I don't yet fully understand these differences, since they get into scipy and SuperLU internals and may ultimately turn out to be very machine dependent. I'll keep digging, but if anybody has any insight, feel free to chime in.

no_patch.txt
with_patch.txt

Update after merging #2764:
no_patch.txt
with_patch.txt

There is almost a 2x speed up for problems with many repeated timesteps with the same source rate.
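
To make the change concrete, here is a minimal sketch of the kind of loop being modified, assuming a generic IPF CRAM-style formulation (not OpenMC's exact implementation) and that the CRAM residues/poles alpha, theta, alpha0 are supplied elsewhere; only the branch inside the loop differs between the two paths:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as sla

def cram_like_solve(A, n0, dt, alpha, theta, alpha0, direct_solve=False):
    """Sketch only: IPF CRAM-style solve of dn/dt = A n over a step dt."""
    A = sp.csc_matrix(A, dtype=np.float64) * dt
    ident = sp.identity(A.shape[0], format='csc')
    y = np.asarray(n0, dtype=np.float64).copy()
    for a, t in zip(alpha, theta):
        shifted = sp.csc_matrix(A - t * ident)  # complex-shifted matrix
        if direct_solve:
            # original path: spsolve handles the factorization internally
            x = sla.spsolve(shifted, y.astype(np.complex128))
        else:
            # this PR's path: explicit LU factorization, then triangular solves
            x = sla.splu(shifted).solve(y.astype(np.complex128))
        y += 2.0 * np.real(a * x)
    return alpha0 * y
```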

Checklist

- I have performed a self-review of my own code
- [ ] I have run clang-format on any C++ source files (if applicable)
- I have followed the style guidelines for Python source files (if applicable)
- I have made corresponding changes to the documentation (if applicable)
- [ ] I have added tests that prove my fix is effective or that my feature works (if applicable)

Comment on lines 71 to 73
direct_solve : bool
Whether to use scipy.linalg.spsolve directly. Default is False
which performs an explicit LU factorization.
Contributor

If LU factorization is better, why not use that all the time?

Contributor

Sorry, that came out before the rest of the review. So there's some context missing in my comment.

But it also doesn't appear that this is configurable by the user. We pass n_result = func(matrix, n_multi, dt, direct_solve=True) later on. So how would a user be able to configure this setting?

From the Zen of Python / PEP 20

There should be one-- and preferably only one --obvious way to do it.

Contributor Author

I agree. The only reason I added this was to go back to the original implementation if transfer_rates is set. In an ideal world we would just do the explicit factorization, but it doesn't seem to be working correctly when transfer rates are used.

Contributor Author

Part of me wonders if CRAM is not the correct method for solving the coupled block matrices in the transfer rates case, but if anybody has insight on that it would be great.

Contributor

Hi @eepeterson, this seems to be a good improvement!

Just curious, what kind of errors did you notice when using LU with transfer_rates? I've also spent a bit of time recently thinking about alternative methods we could use to speed up solving the block matrix when many transfer rates are set, and LU factorization also came up.

Note also that the block matrix is only constructed when a destination_material is set in the add_transfer_rate method; otherwise it's the same depletion matrix as before with only a removal term added. Did you get strange results with or without a destination_material set?
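
For illustration, the two cases could be sketched roughly like this (hypothetical sizes and rates, not the actual add_transfer_rate code):

```python
import scipy.sparse as sp

n = 100                                            # hypothetical number of nuclides
A1 = sp.random(n, n, density=0.05, format='csc')   # stand-in burnup matrix, material 1
A2 = sp.random(n, n, density=0.05, format='csc')   # stand-in burnup matrix, material 2
r = 1e-5                                           # hypothetical transfer rate (1/s)
I = sp.identity(n, format='csc')

# No destination_material: only a removal term added to the diagonal
A_removal_only = A1 - r * I

# destination_material set: coupled block matrix; atoms removed from
# material 1 show up as a source term for material 2
A_block = sp.bmat([[A1 - r * I, None],
                   [r * I,      A2]], format='csc')
```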

Contributor Author

Thanks for chiming in @church89. I'm seeing strange results primarily in the assert_reaction_rates_equal portion of the deplete_with_transfer_rates regression tests. The most finicky test is the final one in the suite, "depletion_with_transfer", so the destination_material is set. However, sometimes other ones fail as well. A couple of concerns I have are below:

  1. Adding external feed and removal seems to introduce eigenvalues with positive real parts in the burnup matrix, which is outside the region where CRAM's approximation is accurate (a quick way to check this is sketched after this list). They may be small enough with low feed rates and short time steps to not make a big difference, but it's still a concern. Until I can dig into the theory a bit more, it seems like CRAM is ill-suited to this coupled block-matrix approach for transfer rates. By contrast, the eigenvalues of every burnup matrix evaluated in the deplete_with_transport regression test suite all have negative real parts.
  2. Despite the theoretical argument that CRAM shouldn't be used for these problems, the nuclide densities coming out of CRAM after the first timestep agree between develop and this branch to within a relative tolerance of 1e-14. It's when they get put back into the simulation with openmc.lib and the resulting reaction rates are calculated that they disagree with the reference results.
  3. The tolerances on the nuclide concentrations and reaction rates seem very small for simulations with 500 particles and 2 batches. By contrast, the tests in the deplete_no_transport case have atom tolerances on the order of 1e-3 and reaction rate tolerances on the order of 1e-2 or even 1e-1. Right now the tolerances in deplete_with_transfer_rates are 1e-4 and 1e-5, respectively. At the very least, it makes sense to me that the reaction rate tolerances should be larger than the atom tolerances. Were there justifications for the tolerances in this test suite, or could they be changed to be more in line with the other suites?
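
(For reference, the eigenvalue check mentioned in item 1 can be done with something like the snippet below, assuming A is the assembled burnup matrix as a scipy sparse matrix; a dense eigendecomposition like this is only practical at regression-test problem sizes.)

```python
import numpy as np
import scipy.sparse as sp

def has_positive_real_eigenvalues(A, tol=0.0):
    """Return True if any eigenvalue of A has a real part above tol."""
    eigvals = np.linalg.eigvals(sp.csr_matrix(A).toarray())
    return bool(np.any(eigvals.real > tol))
```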

Contributor

@eepeterson, for the transfer_rates case without a destination_material set, I would have expected no issues, independently of the transfer rate values set, as it should simply add a proportional term to the diagonal of the burnup matrix and no block matrix is built.

On the other hand, when destination materials are set, the coupled block matrix becomes very sparse and might introduce some instability in CRAM, but apart from a few more negative nuclide densities the results made sense.

There is no particular reason for the tolerances used, so if it helps pass some more tests and brings them in line with the other suites, we can change them.

@drewejohnson (Contributor) left a comment

I've had this in the back of my head for a bit, and just haven't had time to get around to it.

The first matter at hand is that the current test suite differs. Looking at one of the failed tests, the reaction rates are different enough to fail a check, but not the number densities:

         assert_atoms_equal(res_ref, res_test, 1e-4)
>       assert_reaction_rates_equal(res_ref, res_test)
...
>                       assert y_test == pytest.approx(y_ref, rel=tol), \
                            f'Reaction rate not equal for material {mat}, nuclide '\
                            f'{nuc}, {rx}\ny_ref={y_ref}\ny_test={y_test}'
E                       AssertionError: Reaction rate not equal for material 1, nuclide Cs135, (n,gamma)
E                       y_ref=[      0.         3007214.20178012]
E                       y_test=[      0.         3009864.74338142]

It would make sense that number densities are different, perhaps slightly, because of differences in the solver. And those could be small enough to pass a number density check. But then those small density differences propagate to reaction rate differences, and in this case are not small enough to pass the check.

Now, the question becomes does this change make the results more correct, and thus the failure is because the reference files need to be updated?

From your message, this appears to give a ~10% speed up, which is excellent! So that's a thing to consider here.

You also said there are fewer negative number density warnings, which is also good. CRAM does not promise all densities are non-negative, but having fewer is better. It would be great if you could add a test that shows negative densities with the current implementation, and fewer or no negative densities with SuperLU.

I'm inclined to be in favor of this, but it's not clear we've made the depletion more accurate according to the test suite here. And if SuperLU gives us better accuracy and performance, we should use that in all cases. If I'm a user working on a depletion model, and there are two linear solver methods for depletion, how do I know what to use and why?

@eepeterson (Contributor Author) commented Nov 4, 2023

Thanks for taking a look @drewejohnson! All good points.

Now, the question becomes does this change make the results more correct, and thus the failure is because the reference files need to be updated?

I have tried updating the reference results and they still don't pass in CI. I have struggled to figure out why, and wonder if it has to do with how scipy is built and/or linked on the CI servers vs. my own machine.

It would be great if you could add a test that shows negative densities w/ the current implementation, and fewer or no negative densites with superlu.

Anecdotally, you can see the results in the output of the two text files I posted. The no_patch version has a number of negative density warnings and the with_patch version does not. The code as written doesn't give the user the ability to run depletion calculations with both the current implementation and the new implementation, so a formal test case isn't possible (if that's what you were suggesting).

If I'm a user working on a depletion model, and there are two linear solver methods for depletion, how do I know what to use and why?

The user never really interacts with the CRAM method specifically and I would like to get rid of this kwarg anyway, but I can't figure out how to get these tests to pass both locally and in CI without this logic to revert to the direct solve when transfer_rates is set on the Integrator.

@eepeterson (Contributor Author)

FWIW, the tests in tests/regression_tests/deplete_with_transfer_rates/ do pass locally for me on this branch since I added the direct_solve kwarg to keep the behavior the same as before when transfer_rates is set.

@eepeterson (Contributor Author)

I have updated the profile results for this patch (the difference between current develop and this branch) now that some of the other improvements have been made. This is the same depletion problem as originally posted at the top of this PR. There is still a ~10% speed up and fewer negative density warnings, as expected.

no_patch.txt
with_patch.txt

I don't love the extra logic in the CRAM method here, but I think this is the cleanest approach for now unless we would prefer to create a separate solver class for use when transfer_rates is set. @paulromano and @drewejohnson what do you think?

@eepeterson eepeterson marked this pull request as draft November 8, 2023 22:10
@paulromano paulromano marked this pull request as ready for review November 9, 2023 13:22
@paulromano (Contributor) left a comment

I'm happy with this as-is. As @eepeterson and I have discussed offline, it's not immediately obvious that the assumptions made in the CRAM method are satisfied when trying to solve the block system that arises from the use of transfer rates, which may be why we're seeing unusual behavior there. For now, it seems prudent to keep the current method for the transfer rates functionality while switching to the explicit LU factorization for regular depletion runs.

@drewejohnson are you satisfied with the current approach?

@drewejohnson (Contributor)

I'm in favor of the LU factorization because of the alleged performance boost. Feels great.

it's not immediately obvious that the assumptions made in the CRAM method are satisfied when trying to solve the block system that arises from the use of transfer rates

Yeah, this feels unrelated to the CSC/CSR/LU-factorization thing at hand: is CRAM appropriate with transfer rates? I wasn't around for the transfer rates discussion, so I'm going to assume it's related to moving fuel or depleting systems where an external actor is supplying/removing nuclides. I keep going back to this section from Maria Pusa's paper on CRAM:

However, it was recently discovered by the author that the eigenvalues of the burnup matrix are generally confined to a region near the negative real axis.1 This observation led to applying the Chebyshev Rational Approximation Method to solve the burnup equations. This method can be interpreted as the best rational approximation on the negative real axis

So if you're using transfer rates as a way to scrub fuel, like off-gassing fission products, then the diagonal of the matrix will remain negative. Probably okay. But if you include enough addition of material that you have a positive rate of change, CRAM may not be valid.
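
As a toy illustration of that point (hypothetical scalar rates, not a real burnup matrix):

```python
# dN/dt = a * N -- the sign of `a` decides whether CRAM's approximation applies
loss = 1e-4               # hypothetical removal rate: decay + absorption + off-gassing (1/s)
feed = 2e-4               # hypothetical feed rate, here assumed proportional to inventory (1/s)

a_scrubbing = -loss       # negative: decaying solution, the regime CRAM is built for
a_net_feed = feed - loss  # positive: exp(+a*t) growth, off the negative real axis
print(a_scrubbing, a_net_feed)
```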

I personally don't like the inclusion of the conditionals in the CRAM solve. Is the rationale behind not using LU for transfer-rate problems that we can't get them to pass on CI? Is there a way to generate the reference data on the CI machine and pull those? Otherwise we're at the mercy of machine differences, library versions, compilers, all those fun things Python hides from us 🙂

@eepeterson (Contributor Author) commented Nov 10, 2023

I personally don't like the inclusion of the conditionals in the CRAM solve. Is the rationale behind not using LU for transfer rate problems we can't get them to pass on CI? Is there a way to generate the reference data on the CI machine and pull those?

See PR #2764, @drewejohnson. I think I'm going to table this one until that one is sorted out; once we understand the issues in the tests with that PR, this one will sort itself out. In that PR I didn't change the solver, just the matrix format, and the tests still didn't pass; I had to loosen the tolerance significantly for them to pass. I would still like to find the fundamental cause, but I think it's unrelated to the splu object, as evidenced in #2764.
