Adding explicit LU factorization within CRAM #2703
Conversation
openmc/deplete/cram.py (outdated diff)

    direct_solve : bool
        Whether to use scipy.linalg.spsolve directly. Default is False,
        which performs an explicit LU factorization.
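For context, a minimal sketch of how a flag like this could switch between a direct sparse solve and an explicit LU factorization inside an IPF-style CRAM loop. This is illustrative only, not the code in this PR; `alpha0`, `alpha`, and `theta` stand for the CRAM rational-approximation constants, and the loop structure is an assumption.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as sla

def cram_step(A, n0, dt, alpha0, alpha, theta, direct_solve=False):
    """Apply a CRAM approximation of expm(A*dt) to n0 (illustrative sketch)."""
    A = sp.csc_matrix(A, dtype=np.complex128) * dt
    ident = sp.eye(A.shape[0], format='csc', dtype=np.complex128)
    y = n0.astype(np.complex128)
    for a, t in zip(alpha, theta):
        shifted = (A - t * ident).tocsc()
        if direct_solve:
            # One-shot sparse solve
            x = sla.spsolve(shifted, y)
        else:
            # Explicit LU factorization via SuperLU, then triangular solves
            x = sla.splu(shifted).solve(y)
        y += 2.0 * np.real(a * x)
    return np.real(alpha0 * y)
```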
If LU factorization is better, why not use that all the time?
Sorry, that came out before the rest of the review, so there's some context missing in my comment.
But it also doesn't appear that this is configurable by the user. We pass `n_result = func(matrix, n_multi, dt, direct_solve=True)` later on, so how would a user be able to configure this setting?
From the Zen of Python / PEP 20
There should be one-- and preferably only one --obvious way to do it.
I agree. The only reason I added this was to go back to the original implementation if transfer_rates
is set. In an ideal world we would just do the explicit factorization, but it doesn't seem to be working correctly when transfer rates are used.
Part of me wonders if CRAM is not the correct method for solving the coupled block matrices in the transfer rates case, but if anybody has insight on that it would be great.
Hi @eepeterson, this seems like a good improvement!
Just curious: what kind of errors did you notice when using LU with transfer_rates? I've also spent a bit of time recently thinking about alternative methods we could use to speed up the solve of the block matrix when many transfer rates are set, and LU factorization also came up.
Note also that the block matrix is only constructed when a destination_material is set in the add_transfer_rate method; otherwise it's the same depletion matrix as before with only a removal term added. Did you get strange results with or without a destination_material set?
Thanks for chiming in @church89. I'm seeing strange results primarily in the assert_reaction_rates_equal portion of the deplete_with_transfer_rates regression tests. The most finicky test is the final one in the suite, "depletion_with_transfer", so the destination_material is set. However, sometimes other ones fail as well. A couple of concerns I have are below:

- Adding external feed and removal seems to introduce eigenvalues with positive real parts in the burnup matrix, which is not where CRAM's approximation is accurate. They may be small enough with low feed rates and short time steps to not make a big difference, but it's still a concern. Until I can dig into the theory a bit more, it seems like CRAM is ill-suited for this coupled block-matrix approach for transfer rates. By contrast, the eigenvalues of every burnup matrix evaluated in the deplete_with_transport regression test suite all have negative real parts (a quick way to check this is sketched after this list).
- Despite the theoretical justification that CRAM shouldn't be used for these problems, the resulting nuclide densities coming out of CRAM after the first timestep agree between develop and this branch to within a relative tolerance of 1e-14. It's when they get put back into the simulation with openmc.lib and the resulting reaction rates are calculated that they disagree with the reference results.
- The tolerances on the nuclide concentrations and reaction rates seem very small for simulations with 500 particles and 2 batches. By contrast, the tests in the deplete_no_transport case have atom tolerances on the order of 1e-3 and reaction rate tolerances on the order of 1e-2 or almost 1e-1. Right now the tolerances in deplete_with_transfer_rates are 1e-4 and 1e-5 respectively. At the very least it makes sense to me that the reaction rate tolerances should be larger than the atom tolerances. Were there justifications for the tolerances in this test suite, or could they be changed to be more in line with the other suites?
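As an aside, since the eigenvalue argument comes up in the first bullet: one quick diagnostic (not part of OpenMC, and assuming the matrix is small enough for a dense eigendecomposition) would be:

```python
import numpy as np

def max_real_eigenvalue(burnup_matrix):
    """Return the largest real part among the eigenvalues of a sparse burnup matrix.

    A positive return value means the matrix lies outside the region where the
    CRAM approximation of exp(A*dt) is known to be accurate.
    """
    w = np.linalg.eigvals(burnup_matrix.toarray())
    return float(w.real.max())
```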
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eepeterson, for the transfer_rates case without a destination_material set, I would have expected no issues, independent of the transfer rate values, as it should simply add a proportional term to the diagonal of the burnup matrix and no block matrix is built.
On the other hand, when destination materials are set, the coupled block matrix becomes very sparse and might introduce some instability in CRAM, but apart from a few more negative nuclide densities the results made sense.
There is no particular reason for the tolerances used, so if it helps pass some more tests and be in line with the other suites, we can change them.
I've had this in the back of my head for a bit and just haven't had time to get around to it.
The first matter at hand is that the current test suite results differ. Looking at one of the failed tests, the reaction rates are different enough to fail a check, but the number densities are not:
    assert_atoms_equal(res_ref, res_test, 1e-4)
    > assert_reaction_rates_equal(res_ref, res_test)
    ...
    > assert y_test == pytest.approx(y_ref, rel=tol), \
          f'Reaction rate not equal for material {mat}, nuclide '\
          f'{nuc}, {rx}\ny_ref={y_ref}\ny_test={y_test}'
    E AssertionError: Reaction rate not equal for material 1, nuclide Cs135, (n,gamma)
    E y_ref=[ 0. 3007214.20178012]
    E y_test=[ 0. 3009864.74338142]
It would make sense that number densities are different, perhaps slightly, because of differences in the solver. And those could be small enough to pass a number density check. But then those small density differences propagate to reaction rate differences, and in this case are not small enough to pass the check.
Now, the question becomes does this change make the results more correct, and thus the failure is because the reference files need to be updated?
From your message, this appears to have a 10% speed up. Which is excellent! So that's a thing to consider here.
You also said there are fewer negative number density warnings, which is also good. CRAM does not promise all densities are non-negative, but having fewer is better. It would be great if you could add a test that shows negative densities w/ the current implementation, and fewer or no negative densities with superlu.
I'm inclined to be in favor of this, but it's not clear we've made the depletion more accurate according to the test suite here. And if SuperLU gives us better accuracy and performance, we should use that in all cases. If I'm a user working on a depletion model, and there are two linear solver methods for depletion, how do I know what to use and why?
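A test along the lines suggested above might look roughly like the sketch below. Note that `solve_direct` and `solve_with_lu` are hypothetical stand-ins for the two code paths, since the patch as written does not expose both to the user:

```python
import numpy as np

def count_negative(densities):
    """Number of negative entries in a vector of nuclide densities."""
    return int(np.sum(np.asarray(densities) < 0.0))

def test_lu_reduces_negative_densities(burnup_matrix, n0, dt,
                                       solve_direct, solve_with_lu):
    # Hypothetical comparison: the explicit-LU path should produce no more
    # negative densities than the direct spsolve path on the same problem.
    assert count_negative(solve_with_lu(burnup_matrix, n0, dt)) <= \
           count_negative(solve_direct(burnup_matrix, n0, dt))
```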
Thanks for taking a look @drewejohnson! All good points.
I have tried updating the reference results and they still don't pass in CI. I have struggled to figure out why and wonder if it has to do with how scipy is built and/or linked on the CI servers vs. my own machine.
Anecdotally you can see the results in the output of the two text files I posted. The no_patch version has a number of negative density warnings and the with_patch version does not. The code as written doesn't give the user the ability to run depletion calculations with the current implementation and the new implementation so a formal test case isn't possible (if that's what you were suggesting).
The user never really interacts with the CRAM method specifically and I would like to get rid of this kwarg anyway, but I can't figure out how to get these tests to pass both locally and in CI without this logic to revert to the direct solve when transfer_rates is set.
FWIW the tests in ...
I have updated the profile results for this patch (difference between current develop and this branch) since some of the other improvements have been made. This is the same depletion problem as originally posted at the top of this PR. There is still a ~10% speed up and fewer negative density warnings, as expected. I don't love the extra logic in the CRAM method here, but I think this is the cleanest approach for now, unless we would prefer to create a separate solver class for use when transfer_rates are set.
I'm happy with this as-is. As @eepeterson and I have discussed offline, it's not immediately obvious that the assumptions made in the CRAM method are satisfied when trying to solve the block system that arises from the use of transfer rates, which may be why we're seeing unusual behavior there. For now, it seems prudent to keep the current method for the transfer rates functionality while switching to the explicit LU factorization for regular depletion runs.
@drewejohnson are you satisfied with the current approach?
I'm in favor of the LU factorization because of the alleged performance boost. Feels great.
Yeah, this feels unrelated to the csc / csr / LU factorization thing at hand: is CRAM appropriate w/ transfer rates? I wasn't around for the transfer rates discussion, so I'm going to assume it's related to moving fuel or depleting systems where an external actor is supplying / removing nuclides. I keep going back to this section from Maria Pusa's paper on CRAM.
So if you're using transfer rates as a way to scrub fuel, like off-gassing fission products, then the diagonal of the matrix will remain negative. Probably okay. But if you include enough addition of material that you have a positive rate of change, CRAM may not be valid. I personally don't like the inclusion of the conditionals in the CRAM solve. Is the rationale behind not using LU for transfer rate problems that we can't get them to pass on CI? Is there a way to generate the reference data on the CI machine and pull those? Otherwise we're at the mercy of machine differences, library versions, compilers, all those fun things python hides from us 🙂
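To make the structure under discussion concrete, here is a toy construction (an assumption for illustration, not the OpenMC implementation) of the coupled block matrix for two materials with a transfer at rate `r` from material 1 to material 2. Pure removal (scrubbing) only makes the diagonal of the first block more negative, while a destination material adds the off-diagonal feed block:

```python
import scipy.sparse as sp

def coupled_block_matrix(A1, A2, r):
    """Toy two-material burnup matrix with a transfer rate r (1/s) from 1 to 2.

    A1 and A2 are the single-material burnup matrices. The -r*I term is the
    removal contribution on the diagonal of material 1; the r*I block feeds
    the removed nuclides into material 2.
    """
    ident = sp.identity(A1.shape[0], format='csc')
    return sp.bmat([[A1 - r * ident, None],
                    [r * ident, A2]], format='csc')
```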
See PR #2764. @drewejohnson I think I'm going to table this one until that one is sorted out; once we understand the issues in the tests with that PR, this one will sort itself out. In that PR I didn't change the solver, just the matrix format, and the tests still didn't pass. I had to loosen the tolerance significantly for them to pass. I would still like to find out the fundamental cause, but I think it's unrelated to the LU factorization itself.
Description
Another performance (and accuracy?) improvement to depletion. For some reason doing an explicit LU factorization before solving within the CRAM method seems to result in far fewer warnings about negative density nuclides and gives a ~10% speed up in the cases I have tried. The testing isn't exhaustive so I'd be curious to find out what others observe. I have attached files below showing the output during the depletion calculation and profile information for my particular example problem. I don't yet fully understand these differences since it gets into the scipy and SuperLU internals and may ultimately end up being very machine dependent because of this, but I'll keep digging and if anybody has any insight, feel free to chime in.
no_patch.txt
with_patch.txt
Update after merging #2764:
no_patch.txt
with_patch.txt
There is almost a 2x speed up for problems with many repeated timesteps with the same source rate.
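One plausible contributor to the repeated-timestep speed-up (an assumption on my part, not something profiled here) is that an explicit factorization can be reused: with SuperLU, the shifted matrix is factored once and each additional right-hand side only costs the triangular solves. A minimal sketch:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as sla

# Stand-in for a shifted burnup matrix: random sparse plus a strongly negative
# diagonal so the system is nonsingular.
n = 1000
M = (sp.random(n, n, density=0.01, format='csc', random_state=0)
     - 50.0 * sp.identity(n, format='csc'))

# Factor once with SuperLU...
lu = sla.splu(M)

# ...then reuse the factorization for every subsequent right-hand side,
# paying only for the forward/backward substitutions.
rng = np.random.default_rng(0)
solutions = [lu.solve(rng.random(n)) for _ in range(10)]
```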
Checklist
- [ ] I have run clang-format on any C++ source files (if applicable)
- [ ] I have added tests that prove my fix is effective or that my feature works (if applicable)