Addition of CuPy as an Accelerated Computing Option #499

Merged (63 commits) · Mar 24, 2023
8f159a9
modifications to use CuPy in Fresnel propagation
Feb 22, 2022
8890980
fixed issue with pad_or_crop_to_shape returning zeros with cupy
Feb 24, 2022
53e4e91
working on more elegant implementation of cupy
Mar 3, 2022
9fac1b5
cupy working with FITSOpticalElements, changed resampling techniques …
Mar 4, 2022
8a4a5f2
Fraunhofer systems and DM functionality with Cupy
Mar 8, 2022
07fd452
ContinuousDeformableMirror works with CuPy when given influence_func
Mar 10, 2022
f79bcfb
cupy integrated for more optics: particularly for WFE optics
Mar 17, 2022
03e40ed
minor modifications for a couple optics to work with cupy
Mar 18, 2022
d680b3e
Merge branch 'spacetelescope:develop' into develop
kian1377 Mar 23, 2022
c365e22
cupy now functional with most optical elements for Fraunhofer and Fre…
Mar 23, 2022
b6d71d3
minor update for SegmentedDeformableMirrors to be CuPy compatible
Mar 25, 2022
5f604e7
making sure tests pass, currently only Kolmogorov test does not pass
Mar 28, 2022
c091d60
modified KolmogorovWFE.power_spectrum and test_wfe to work with diffe…
Mar 29, 2022
eaae159
Merge branch 'spacetelescope:develop' into develop
kian1377 Apr 4, 2022
2128f71
minor update to rotation of analytic optics
May 13, 2022
f42e8ea
minor update to check if a GPU exists when CuPy available
May 26, 2022
6961c19
Merge branch 'spacetelescope:develop' into develop
kian1377 May 29, 2022
ddd9306
changes to allow for switching between CPU and GPU with CuPy
May 29, 2022
4c6588a
Merge branch 'develop' of https://github.com/kian1377/poppy into develop
May 29, 2022
70442dc
most functionality restored along with ability to switch between CPU …
May 30, 2022
1b9b131
a few minor changes for scipy ndimage and special functions
May 31, 2022
9457f8c
minor update for accel_math test and cleaning up code
May 31, 2022
fcc8ccd
CuPy update for matrixDFT and FITSOpticalElement resample
May 31, 2022
8ba5a6c
FITSOpticalElement change for when only OPD is supplied to be CuPy co…
May 31, 2022
34afa79
implemented map_coordinates for wavefront resampling with cupy and mi…
Jun 2, 2022
d6059af
Merge branch 'spacetelescope:develop' into develop
kian1377 Jun 6, 2022
b697e07
updated zernike.py so switching between CPU and CuPy works for Zernik…
Jun 7, 2022
298f1cf
minor fixes for BaseWavefront and zernike.arbitrary_basis
Jun 16, 2022
9fe56af
minor update for zernike.py functionality
kian1377 Sep 26, 2022
d005c2c
had to delete a line
kian1377 Sep 26, 2022
5b7c460
Merge branch 'spacetelescope:develop' into develop
kian1377 Feb 15, 2023
cb10c6b
fixed geometry.py _ncp.float to _ncp.float64 as float was deprecated
kian1377 Feb 16, 2023
ec6465d
Allow display code to handle GPU arrays, transferring data to main me…
mperrin Feb 18, 2023
b1ae90b
get test_core passing on GPU
mperrin Feb 18, 2023
e615458
get test_accel_math and test_utils passing
mperrin Feb 18, 2023
47372da
Fix subtle indexing inconsistency in Fresnel resampling function.
mperrin Feb 21, 2023
0c32d7d
Get test_fresnel passing
mperrin Feb 21, 2023
6090180
get tests working in several more files
mperrin Feb 21, 2023
dda712e
get test_instruments passing on GPU
mperrin Feb 21, 2023
6de49a2
get test_matrixDFT passing on GPU
mperrin Feb 21, 2023
9e01ba2
get test_optics passing on GPU
mperrin Feb 23, 2023
f4d6d8d
get test_geometry passing on GPU
mperrin Feb 23, 2023
7e428d8
get tests_multiprocessing 'passing' (really, mostly skipped) on windows
mperrin Feb 23, 2023
12cca34
get test_sign_conventions passing on GPU
mperrin Feb 23, 2023
29e4f0c
get test_zernike passing on GPU
mperrin Feb 23, 2023
906d61b
fix one test_fresnel function that was missed before
mperrin Feb 23, 2023
1cc7650
getting test_fft to pass
kian1377 Feb 23, 2023
bc131a2
made numexpr be imported if NUMEXPR_AVAILABLE and cleaned up commente…
kian1377 Feb 23, 2023
fa7b4d3
debugging issues with test_wfe on gpu
kian1377 Feb 24, 2023
922e3d6
getting physical wavefront and some more wfe tests to pass on GPU
kian1377 Mar 1, 2023
d33b8a5
all tests now pass on GPU on my machine
kian1377 Mar 2, 2023
b63bc3c
minor changes to the wfe and test_wfe files
kian1377 Mar 3, 2023
5affeec
Merge branch 'develop' into develop
kian1377 Mar 10, 2023
d2edb25
making sure everything is up to date with local repo
kian1377 Mar 10, 2023
e5c02e7
renamed _ncp to xp to be consistent with common practices
kian1377 Mar 17, 2023
02e3889
minor: cleanup comments and whitespace
mperrin Mar 24, 2023
8cb000d
clean up / simplify imports. Avoid proliferating calls to update_math…
mperrin Mar 24, 2023
4ea0d07
more whitespace and comment cleanup
mperrin Mar 24, 2023
4699058
switch to xp convention for numpy/cupy in all files, including tests
mperrin Mar 24, 2023
105b83a
Merge branch 'develop' into develop
mperrin Mar 24, 2023
c71e2d2
fix syntax error typo from prior commit
mperrin Mar 24, 2023
4c38bd8
fix to run update_math_settings once on initial import
mperrin Mar 24, 2023
c2f925b
one more round of comment/whitespace cleanup for GPU code
mperrin Mar 24, 2023
5 changes: 4 additions & 1 deletion poppy/__init__.py
@@ -25,7 +25,6 @@

__minimum_python_version__ = "3.9"


class UnsupportedPythonError(Exception):
    pass

@@ -68,6 +67,8 @@ class Conf(_config.ConfigNamespace):
                                  'is available)?')
    use_opencl = _config.ConfigItem(True, 'Use OpenCL for FFTs on GPU (assuming it' +
                                    'is available)?')
    use_cupy = _config.ConfigItem(True, 'Use CuPy for FFTs on GPU (assuming it' +
                                  'is available)?')
    use_numexpr = _config.ConfigItem(True, 'Use NumExpr to accelerate array math (assuming it' +
                                     'is available)?')

@@ -102,6 +103,8 @@ class Conf(_config.ConfigNamespace):

conf = Conf()

from . import accel_math  # This should be the first import here, to ensure math accelerators and settings are loaded prior to
                          # the rest of poppy code being loaded.
from . import poppy_core
from . import utils
from . import optics
87 changes: 74 additions & 13 deletions poppy/accel_math.py
@@ -3,6 +3,7 @@
# Various functions related to accelerated computations using FFTW, CUDA, numexpr, and related.
#
import numpy as np
import scipy
import multiprocessing
import matplotlib.pyplot as plt
from . import conf
@@ -11,7 +12,6 @@
import logging
_log = logging.getLogger('poppy')


try:
# try to import FFTW to see if it is available
import pyfftw
@@ -60,23 +60,54 @@
except ImportError:
_OPENCL_AVAILABLE = False

try:
    # try to import cupy packages to see if they are available,
    # and check whether GPU hardware is present
    import cupy as cp
    import cupyx.scipy.ndimage
    import cupyx.scipy.signal
    import cupyx.scipy.special
    cp.cuda.Device()  # checks if a GPU exists
    _CUPY_PLANS = {}
    _CUPY_AVAILABLE = True
except Exception:
    cp = None
    _CUPY_AVAILABLE = False

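The guarded import above is a common optional-GPU pattern: attempt the import, touch the device so missing hardware also triggers the fallback, then alias the array module. A minimal CPU-safe sketch (the names `GPU_AVAILABLE`, `a`, and `total` are illustrative, not poppy's):

```python
import numpy as np

try:
    # Optional dependency: cupy only imports on CUDA-capable systems, and
    # constructing a Device raises if no GPU is actually present.
    import cupy as cp
    cp.cuda.Device()
    GPU_AVAILABLE = True
except Exception:  # ImportError, or a CUDA runtime error from Device()
    cp = None
    GPU_AVAILABLE = False

# Alias the array module, as poppy does with its `xp` name
xp = cp if GPU_AVAILABLE else np

a = xp.arange(9.0).reshape(3, 3)
total = float(xp.sum(a))  # 36.0 on either backend
```

Because numpy and cupy share an API, everything after the alias is backend-agnostic.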
_USE_CUPY = (conf.use_cupy and _CUPY_AVAILABLE)
_USE_CUDA = (conf.use_cuda and _CUDA_AVAILABLE)
_USE_OPENCL = (conf.use_opencl and _OPENCL_AVAILABLE)
_USE_NUMEXPR = (conf.use_numexpr and _NUMEXPR_AVAILABLE)
_USE_NUMEXPR = (conf.use_numexpr and _NUMEXPR_AVAILABLE and not _USE_CUPY)
_USE_FFTW = (conf.use_fftw and _FFTW_AVAILABLE)
_USE_MKL = (conf.use_mkl and _MKLFFT_AVAILABLE)

xp = np
_scipy = scipy

def update_math_settings():
    """ Update the module-level math flags, based on user settings
    """
    global _USE_CUDA, _USE_OPENCL, _USE_NUMEXPR, _USE_FFTW, _USE_MKL
    global _USE_CUPY, _USE_CUDA, _USE_OPENCL, _USE_NUMEXPR, _USE_FFTW, _USE_MKL
    global xp, _scipy

    _USE_CUPY = (conf.use_cupy and _CUPY_AVAILABLE)
    _USE_CUDA = (conf.use_cuda and _CUDA_AVAILABLE)
    _USE_OPENCL = (conf.use_opencl and _OPENCL_AVAILABLE)
    _USE_NUMEXPR = (conf.use_numexpr and _NUMEXPR_AVAILABLE)
    _USE_NUMEXPR = (conf.use_numexpr and _NUMEXPR_AVAILABLE and not _USE_CUPY)
    _USE_FFTW = (conf.use_fftw and _FFTW_AVAILABLE)
    _USE_MKL = (conf.use_mkl and _MKLFFT_AVAILABLE)

    if _USE_CUPY:
        xp = cp
        _scipy = cupyx.scipy
    else:
        xp = np
        _scipy = scipy

# Call update_math_settings once on initial import
# This ensures an initial setup is done prior to any code execution
update_math_settings()
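One subtlety of rebinding the module-level `xp` inside `update_math_settings()`: code that did `from accel_math import xp` keeps whatever binding existed at import time, while attribute access through the module sees the switch. A stand-in demonstration (the module `accel_stub` and class `FakeCupy` are invented for this sketch, not poppy code):

```python
import types
import numpy as np

# accel_stub stands in for poppy.accel_math; FakeCupy stands in for cupy.
accel = types.ModuleType("accel_stub")
accel.xp = np

frozen_xp = accel.xp  # what `from accel_math import xp` would capture

class FakeCupy:
    pass

accel.xp = FakeCupy  # simulates update_math_settings() switching backends
```

Here `frozen_xp` still points at numpy afterwards, while `accel.xp` reflects the new backend, which is why callers should look the name up through the module each time.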


def _float():
""" Returns numpy data type for desired precision based on configuration """
Expand Down Expand Up @@ -107,6 +138,8 @@ def _exp(x):
    """
    if _USE_NUMEXPR:
        return ne.evaluate("exp(x)", optimization='moderate', )
    elif _USE_CUPY:
        return cp.exp(x)
    else:
        return np.exp(x)

@@ -128,6 +161,8 @@ def _fftshift(x):
        numBlocks = (int(N/blockdim[0]),int(N/blockdim[1]))
        cufftShift_2D_kernel[numBlocks, blockdim](x.ravel(),N)
        return x
    elif _USE_CUPY:
        return cp.fft.fftshift(x)
    else:
        return np.fft.fftshift(x)

@@ -153,6 +188,8 @@ def _ifftshift(x):
        numBlocks = (int(N/blockdim[0]),int(N/blockdim[1]))
        cufftShift_2D_kernel[numBlocks, blockdim](x.ravel(),N)
        return x
    elif _USE_CUPY:
        return cp.fft.ifftshift(x)
    else:
        return np.fft.ifftshift(x)
Comment on lines +191 to 194
Collaborator:
Couldn't we simplify this whole part to use _ncp, or xp after that switch, to call the appropriate CPU/GPU code? In other words like:

else:
    return xp.fft.ifftshift(x)

This is true for several places here in accel_math.py.

For now I'm choosing to leave this as-is - we want to get this PR merged in rather than keep polishing indefinitely :-)


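The simplification suggested in this review comment can be sketched as follows, with numpy standing in for whichever backend `xp` is bound to:

```python
import numpy as np

xp = np  # would be bound to cupy when the GPU backend is selected

def _ifftshift(x):
    # numpy and cupy expose the same fft namespace, so one line suffices
    return xp.fft.ifftshift(x)

out = _ifftshift(np.array([0., 1., 2., 3.]))
# For an even-length axis, ifftshift rotates by n//2: → [2., 3., 0., 1.]
```

The per-backend branches only remain necessary for accelerators like pyculib whose API is not numpy-shaped.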
@@ -192,7 +229,7 @@ def fft_2d(wavefront, forward=True, normalization=None, fftshift=True):

"""
## To use a fast FFT, it must both be enabled and the library itself has to be present
global _USE_OPENCL, _USE_CUDA # need to declare global in case we need to change it, below
global _USE_OPENCL, _USE_CUDA, _USE_CUPY # need to declare global in case we need to change it, below
t0 = time.time()

# OpenCL cfFFT only can FFT certain array sizes.
@@ -202,10 +239,12 @@ def fft_2d(wavefront, forward=True, normalization=None, fftshift=True):
_USE_OPENCL = False

# This annoyingly complicated if/elif is just for the debug print statement
if _USE_CUDA:
method = 'pyculib (CUDA GPU)'
if _USE_CUPY:
method = 'cupy (GPU)'
elif _USE_OPENCL:
method = 'pyopencl (OpenCL GPU)'
elif _USE_CUDA:
method = 'pyculib (CUDA GPU)'
elif _USE_MKL:
method = 'mkl_fft'
elif _USE_FFTW:
@@ -255,7 +294,13 @@ def fft_2d(wavefront, forward=True, normalization=None, fftshift=True):
event.wait()
wavefront[:] = wf_on_gpu.get()
del wf_on_gpu

    if _USE_CUPY:
        do_fft = cp.fft.fft2 if forward else cp.fft.ifft2
        if normalization is None:
            normalization = 1./wavefront.shape[0] if forward else wavefront.shape[0]
        wavefront = do_fft(wavefront)

    elif _USE_MKL:
        # Intel MKL is a drop-in replacement for numpy fft but much faster
        do_fft = mkl_fft.fft2 if forward else mkl_fft.ifft2
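The default `normalization` chosen in the CuPy branch (1/N forward, N inverse, for an N x N wavefront) corresponds to the unitary "ortho" FFT convention, assuming the factor is then applied multiplicatively as in the other branches. A CPU-only check with numpy, whose FFT conventions cupy mirrors:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
x = rng.standard_normal((N, N))

# Forward: unnormalized FFT scaled by 1/N. Inverse: numpy's default (1/N^2)
# scaled back up by N. Both match norm="ortho" for an N x N array.
fwd = np.fft.fft2(x) * (1.0 / N)
inv = np.fft.ifft2(x) * N

assert np.allclose(fwd, np.fft.fft2(x, norm="ortho"))
assert np.allclose(inv, np.fft.ifft2(x, norm="ortho"))
# Parseval: the ortho-normalized transform preserves total power
assert np.isclose(np.sum(np.abs(fwd)**2), np.sum(np.abs(x)**2))
```

This power-preserving scaling is what lets the propagators keep wavefront intensity normalized across transforms.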
@@ -412,14 +457,14 @@ def benchmark_fft(npix=2048, iterations=20, double_precision=True):
print("Timing performance of FFT for {npix} x {npix}, {complextype}, with {iterations} iterations".format(
npix=npix, iterations=iterations, complextype=complextype))

defaults = (poppy.conf.use_mkl, poppy.conf.use_fftw, poppy.conf.use_numexpr, poppy.conf.use_cuda,
defaults = (poppy.conf.use_mkl, poppy.conf.use_fftw, poppy.conf.use_numexpr, poppy.conf.use_cupy,
poppy.conf.use_opencl, poppy.conf.double_precision)
poppy.conf.double_precision = double_precision

# Time baseline performance in numpy
print("Timing performance in plain numpy:")

poppy.conf.use_mkl, poppy.conf.use_fftw, poppy.conf.use_numexpr, poppy.conf.use_cuda, poppy.conf.use_opencl = (False, False, False, False, False)
poppy.conf.use_mkl, poppy.conf.use_fftw, poppy.conf.use_numexpr, poppy.conf.use_cupy, poppy.conf.use_opencl = (False, False, False, False, False)
update_math_settings()
time_numpy = timer.timeit(number=iterations) / iterations
print(" {:.3f} s".format(time_numpy))
@@ -474,14 +519,14 @@ def benchmark_fft(npix=2048, iterations=20, double_precision=True):
time_opencl = np.NaN


poppy.conf.use_mkl, poppy.conf.use_fftw, poppy.conf.use_numexpr, poppy.conf.use_cuda,\
poppy.conf.use_mkl, poppy.conf.use_fftw, poppy.conf.use_numexpr, poppy.conf.use_cupy,\
poppy.conf.use_opencl, poppy.conf.double_precision = defaults

    return {'numpy': time_numpy,
            'fftw': time_fftw,
            'numexpr': time_numexpr,
            'mkl': time_mkl,
            'cupy': time_cuda,
            'opencl': time_opencl}


@@ -694,4 +739,20 @@ def test_mft_numexpr(array, npix=64):
plt.title(f"Matrix Fourier Transform timings\n{cpu_label}", fontweight='bold')

if savefig:
plt.savefig(f"bench_mfts.png")
plt.savefig(f"bench_mfts.png")


def is_on_gpu(array):
    """Simple utility function to check if an array is on the GPU
    (only possible if using CuPy currently)
    """
    if _USE_CUPY:
        if isinstance(array, cp.ndarray):
            return True
    return False


def ensure_not_on_gpu(array):
    """Utility function to ensure an array is in CPU memory,
    copying it from GPU memory if necessary
    """
    return array.get() if is_on_gpu(array) else array
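These helpers can be exercised without a GPU by mimicking cupy's `.get()` interface, which copies device data into a numpy array on the host. `DeviceArrayStub` is an invented stand-in for `cupy.ndarray`, and the `hasattr` check is a dependency-free approximation of the real `isinstance` test:

```python
import numpy as np

class DeviceArrayStub:
    """Stand-in for cupy.ndarray: exposes .get() to copy data to host memory."""
    def __init__(self, data):
        self._data = np.asarray(data)

    def get(self):
        return self._data.copy()

def ensure_not_on_gpu(array):
    # The real helper checks isinstance(array, cp.ndarray); duck typing on
    # .get() is used here so the sketch runs without cupy installed.
    return array.get() if hasattr(array, "get") else array

host = ensure_not_on_gpu(np.ones(3))                      # passes through
moved = ensure_not_on_gpu(DeviceArrayStub([1., 2., 3.]))  # copied to host
```

Display and I/O code call this pattern so that matplotlib and FITS writers only ever see host-memory numpy arrays.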