-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Addition of CuPy as an Accelerated Computing Option #499
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## develop #499 +/- ##
===========================================
- Coverage 74.74% 73.97% -0.78%
===========================================
Files 18 18
Lines 6502 6612 +110
===========================================
+ Hits 4860 4891 +31
- Misses 1642 1721 +79
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report in Codecov by Sentry. |
Pulling changes made to update POPPY support to 3.10
…nor fixed tilt() with cupy
I am working through it to see what I changed, but I dont recall ever changing anything related to that test. I remember adding a test called test_inwave_fresnel in #402, but dont remember touching any other tests. Also, there are a couple minor changes I also made since yesterday which I will be pushing shortly. |
@kian1377 - FYI that development has been updated to settle failed tests. Fetch/Pull required. |
@BradleySappington , Are you talking about the issue with the scikit-image test failing? I haven't seen any commits or changes to that test so I am a little confused as to what development you are talking about. |
@kian1377 I was speaking to the CI tests that have been historically failing for reasons unrelated to your branch. Looks like you just pushed develop so some of the CI tests should now pass. |
Ok, I saw there was a conflict so I just resolved that and updated my repo by pulling some of the other commits. |
@mperrin , I installed scikit-image onto my desktop again and I am recreating the issue with the scikit-image test. Because this is is an issue for both CPU and GPU versions of the code, is it necessary to resolve this within this PR? I was thinking this issue should be resolved in a separate PR so there is a more clear record of its problem and the solution. Here is the current status of the tests on my desktop running with GPU. |
Bravo @kian1377 on getting the remainder of the tests passing! It's very nice to get the entire test suite passing again on GPU, and that gives a lot of confidence in the robustness of a PR this complicated. I completely concur with splitting out the failing Fresnel text (which doesn't even get ran by the Github Actions CI) to a separate task in #552. Let me give this one more review/reread in the next day or two (I was out last week), but I think we are very very close to pushing the merge button on this one :-) |
Sounds good to me. I put some comments in the code so it is easy to understand why I made those changes, but feel free to delete those comments to clean up the code as you review the changes. One of your initial concerns that I have not addressed are those repeated lines of code that update the accel_math settings in the initialization of objects because I continued switching between GPU and CPU while I was debugging the tests, so that was a feature I kept using. Let me know if there are aspects of the code that I should change. |
@mperrin One change that I just thought of that may make the code a little cleaner is instead of import cupy or numpy as _ncp, I could change it to just by xp. This way, anyone reading the code knows that xp is a stand in for np or cp. Just thought this may be a better practice to use so let me know if you want this to be changed. |
Hi @kian1377, yes, good suggestion. I read some and found that yes using (Remind me which code editor you're using? I remember you had asked me about PyCharm which I use. That sort of global rename refactoring is super straightforward to do in PyCharm, with automated code refactoring tools that are much more sophisticated than just a string search-and-replace. One of the many reasons I like to use PyCharm, FYI) |
@kian1377 I'm doing a full reread/review now. FYI, I assume this will be OK with you, but for minor cleanup like removing commented-out lines, I'm just going to do that myself and push to this branch; it's as easy for me to do that as it would be to flag those lines here in GitHub. To answer your earlier question about the repeated lines of code in the initialization of objects. I do think that's less than desirable to have blocks of repeated code. And isn't it a performance hit to have many many calls to update_math_settings which are most of the time going to be unnecessary? I don't want to get hung up on this, but I wonder if the use case of switching back and forth is rare enough or only needed in special cases for debugging, and so maybe it's not worth trying to add the extra complications to support switching arbitrarily in general? |
I just went ahead and replaced all _ncp instances with xp and fixed a couple bugs that came up after doing so. I agree that we should remove those repeated blocks of code that are for switching between packages, but I kept using them for debugging purposes so if we wait until everything else in the PR is ready to be merged, then I can delete those lines and do a final check to make sure it didn't break anything else. And feel free to delete some lines/comments that I had put in and push them directly. I just left those so it would be easier for you during code review. |
…_settings, even if that makes it harder to toggle between GPU and CPU calculation backends.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After lots of excellent work by @kian1377, plus several passes of close edits and review by me, I'm going to declare this good to go! A major enhancement to poppy with substantial speed improvements using GPU hardware.
The entire test suite passes using GPU hardware on my laptop. And likewise all tests pass using CPU, locally on my laptop and also on the Github Actions CI.
Comments below are just some minor notes and comments on places we could further tune in subsequent PRs later.
if isinstance(self.rotation, u.Quantity) and self.rotation.unit==u.degree: | ||
angle = -np.deg2rad(self.rotation).value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this change have anything to do with the GPU code? Or is it an unrelated bug fix? I guess the point is that you want the variable angle
to be a bare float rather than a Quantity in the end, yes?
In any case this could be generalized & simplified to
if isinstance(self.rotation, u.Quantity):
angle = -np.deg2rad(self.rotation.to_value(u.degree))
That will work for any input rotation unit.
elif _USE_CUPY: | ||
return cp.fft.ifftshift(x) | ||
else: | ||
return np.fft.ifftshift(x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't we simplify this whole part to use _ncp
, or xp
after that switch, to call the appropriate CPU/GPU code? In other words like:
else:
return xp.fft.ifftshift(x)
This is true for several places here in accel_math.py.
For now I'm choosing to leave this as-is - we want to get this PR merged in rather than keep polishing indefinitely :-)
poppy/accel_math.py
Outdated
if _USE_CUPY: ######################################################################### | ||
do_fft = cp.fft.fft2 if forward else cp.fft.ifft2 | ||
if normalization is None: | ||
normalization = 1./wavefront.shape[0] if forward else wavefront.shape[0] | ||
wavefront = do_fft(wavefront) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another place where we could simplify - I think we don't even need an _USE_CUPY branch of the if statement. The same case could handle plain numpy and cupy using the xp
approach.
Everything passes (including local tests on GPU hardware). For the record the one 'failing' CI check is the code coverage, which has a slight decrease in coverage. This is not surprising since there's no way currently to test GPU code on Github Actions. (See https://github.com/orgs/github/projects/4247/views/1?filterQuery=+GPU&pane=issue&itemId=4967370; this is a "future" feature for Github Actions, with no particular timeline announced) Good to merge! |
Thank you for doing a thorough review! Happy to have helped and contributed
to this package once more. I can certainly address bugs or more
general code updates in future pull requests. Have a great day!
…On Fri, Mar 24, 2023 at 12:54 PM Marshall Perrin ***@***.***> wrote:
*External Email*
Merged #499 <#499> into
develop.
—
Reply to this email directly, view it on GitHub
<#499 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AMCLT2W2L6MYRZOK3MKUDUTW5X3YFANCNFSM5R4HJSKA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
This is a bit more of an extensive PR as it seeks to add CuPy as an accelerated math option for many computations including FFTs, MFTs, and exponentials for propagation with the OpticalSystem or FresnelOpticalSystem classes. To start with, CuPy is a package for GPU accelerated computing. While POPPY has GPU computing options available with PyOpenCL and with pyculib (which has been deprecated by Numba in favor of CuPy), this implementation seeks to perform all calculations on the GPU until the end of propagation. This reduces time for calculations as arrays no longer need to be transferred between GPU memory and standard memory when performing different calculations.
CuPy has been designed to be very similar to Numpy and has many of the same features requiring the same syntax to use. An example is numpy.fft.fft2, which with CuPy is cupy.fft.fft2. The catch is that CuPy functions can not be used on Numpy arrays since Numpy arrays are stored on standard memory. As such, the implementation strategy was to use the statement “import cupy as np” when POPPY’s Config has been set to use CuPy. This makes performing all calculations on GPU more seamless as wavefront arrays for Wavefront or FresnelWavefront objects along with optical element phasors are automatically calculated as CuPy arrays.
In addition to using CuPy for basic computations of FFTs and exponentials, CuPy also enables the use of many SciPy functions through its cupyx functions. The cupyx functions can be imported much like many scipy functions with a statement such as “import cupyx.scipy.ndimage as ndimage”. So many functions for computing array rotations or interpolations are also imported through cupyx or scipy based on POPPY’s Config setup. The same method is also used to compute Bessel functions.
Currently, many optics have been tested for use with CuPy. A table with these optics is provided below with some comments regarding the functionality of some optics.
All tests in the test suite for POPPY were also run. Currently, there is only an issue with the KolmogorovWFE test not passing due to a units issue in an exponential. This only comes up when using a Tatarski power spectrum. The tests have only been run with standard CPU computations to make sure POPPY is not running into critical errors because of the addition of CuPy even if a user isn’t using the CuPy feature.
Computation comparisons have been performed to illustrate the benefit of this accelerated computing feature. Below are comparisons of the times required for a PSF to be calculated for varying array sizes using the MKL FFT option versus the CuPy calculations. The optical systems tested had 5 different surfaces/optics. The system used for these comparisons was the University of Arizona’s HPC Puma nodes. The node utilized 32 AMD EPYC 7642 CPUs and the NVIDIA Tesla V100S GPU.
One catch with this is none of POPPY’s display functionality is compatible with CuPy. This is because matplotlib cannot plot CuPy arrays since they are only on GPU memory. In order to obtain a Numpy array from a CuPy array, the “cupy.ndarray.get()” method can be used. So users can obtain intensity and phase arrays by adding .get().
It should be noted that the following POPPY features have not been tested for functionality with CuPy:
@douglase was also involved in the addition so I am including him in this PR.
Let me know your thoughts and what changes need to be made.