Releases: IntelPython/dpctl
v0.14.2
Added
- Added dpctl.SyclDevice.partition_max_sub_devices property #1005
- Added dpctl.program.SyclKernel.max_sub_group_size property #1028
- Implemented printing of usm_ndarray #1013, #1043, #1060
- Implemented support for advanced indexing for dpctl.tensor.usm_ndarray #1095, #1097, #1099, #1101
- Implemented support for platform listing in dpctl.__main__ script #1014
- Improved performance of dpctl.tensor.asnumpy #1026
- Added UsmNDArray_Make* C-API for constructing dpctl.tensor.usm_ndarray from native allocations #1050, #1067
- Added support for dpctl.SyclDevice.native_vector_width_* device descriptors #1075
- Added dpctl::tensor::usm_ndarray::get_shape_vector and dpctl::tensor::usm_ndarray::get_strides_vector methods #1090
Changed
- Removed dpctl.select_host_device, dpctl.has_host_device, dpctl.SyclDevice.is_host, and dpctl.SyclDevice.has_aspect_host since support for host device has been removed in DPC++ 2023 and from SYCL 2020 spec #1028
- usm_ndarray is made writable by default #1012, and the writable flag is now checked by __setitem__.
- Added convenience signature for C++ utility function in "dpctl4pybind11.hpp" #1016
- Improved error reported when attempting to submit kernel that uses a data-type unsupported by target device #1018, #1040
- Updated C++ code to require DPC++ 2023.0.0 or newer #1028, #1066
- The dpctl.tensor.Device class supports print_device_info method #1029, equality comparison, and hashing #1048
- Updated version of pybind11 used to 2.10.2 #1031
- Improved internal utility responsible for reduction of iteration space dimensionality #1044, #1054
- Changed return type of DPCTLUSM_GetPointerType function in SyclInterface library #1061, #1065
- Updated supported version of DLPack to 0.8 #1073
- Implemented queue cache per context/device pair and deployed it in dpctl.memory, dpctl.tensor.from_dlpack and dpctl.tensor array creation functions #1076, #1079
- Maintenance, CI work: #1001, #1009, #1011, #1024, #1030, #1032, #1035, #1037, #1039, #1041, #1045, #1047, #1055, #1057, #1059, #1068, #1070, #1074, #1077, #1078, #1081, #1084, #1085, #1088, #1086, #1092, #1093
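The entry on reducing iteration space dimensionality refers to an internal utility whose code is not shown in these notes. As a rough illustration of the general idea only, the pure-Python sketch below merges adjacent axes of a strided array when they can be walked as a single axis. The name `collapse_axes` and the merging rule are assumptions made for this example, not dpctl's implementation.

```python
def collapse_axes(shape, strides):
    """Merge adjacent axes that can be traversed as one axis.

    Hypothetical helper for illustration; not part of dpctl.
    Strides are counted in elements; axes are listed outermost first.
    """
    # Axes of extent 1 do not affect addressing, so drop them first.
    dims = [(n, s) for n, s in zip(shape, strides) if n != 1]
    if not dims:
        return (1,), (1,)  # scalar-like array: one trivial axis
    merged = [dims[0]]
    for n, s in dims[1:]:
        prev_n, prev_s = merged[-1]
        if prev_s == n * s:
            # One step of the outer axis spans the whole inner axis: fuse.
            merged[-1] = (prev_n * n, s)
        else:
            merged.append((n, s))
    new_shape, new_strides = zip(*merged)
    return new_shape, new_strides


# A C-contiguous 2x3x4 array collapses to a single axis of 24 elements.
print(collapse_axes((2, 3, 4), (12, 4, 1)))  # ((24,), (1,))
# Padding between outer slices prevents fusing the outermost axis.
print(collapse_axes((2, 3, 4), (24, 4, 1)))  # ((2, 12), (24, 1))
```

Fewer axes mean less index arithmetic per element inside a kernel, which is the usual motivation for this kind of simplification.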
Fixed
- Fixed error gh-998 in forming Python exception, #999.
- A small memory leak fixed, #1000
- Improved dtype support in dpctl.tensor.full, PR #1002
- Added missing header file #1008 fixing gh-1007
- Fixed a typo in device-specific dtype mapping #1015
- Fixed default device integer type to align with NumPy's behavior on Windows #1017
- Fixed unexpected overflow in dpctl.tensor.linspace when one of the parameters is the largest floating point value #1034
- Constructors dpctl.tensor.empty, dpctl.tensor.zeros, and the usm_ndarray constructor itself no longer allow creating arrays with data types not supported by the targeted device #1042
- Fixed parameter validation in dpctl.SyclQueue constructor #1052
- Fixed usm_type of the resulting array in dpctl.tensor.tril and dpctl.tensor.triu functions #1062
- Used DPC++ configuration files to ensure correct use of conda compiler toolchain on Linux #1072
- Fixed issue with empty argument of dpctl.tensor.meshgrid function #1080
- Fixed linking problem on Windows enabling dpctl to be functional on Windows for devices not supporting some data types #1083
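The linspace overflow fix is easier to picture with a small example. The sketch below is plain Python, not dpctl code, and the two helper names are hypothetical. It shows how computing `stop - start` first can overflow when an endpoint is the largest finite float, and one common way to avoid that by interpolating with weights in [0, 1]. Whether #1034 uses this exact approach is not stated in these notes.

```python
import math
import sys


def naive_linspace(start, stop, num):
    """Step-based formula; (stop - start) can overflow to infinity."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def safe_linspace(start, stop, num):
    """Weighted interpolation; intermediates stay within the endpoints."""
    if num < 2:
        raise ValueError("this sketch requires num >= 2")
    out = []
    for i in range(num):
        t = i / (num - 1)  # weight in [0, 1]
        out.append(start * (1.0 - t) + stop * t)
    return out


big = sys.float_info.max
bad = naive_linspace(-big, big, 3)
good = safe_linspace(-big, big, 3)
print(all(math.isfinite(v) for v in bad))   # False
print(all(math.isfinite(v) for v in good))  # True
```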
Full Changelog: 0.14.0...0.14.2
v0.14.0
[0.14.0] - 11/18/2022
Added
- Implemented dpctl.tensor.linspace function from array-API #875.
- Implemented dpctl.tensor.eye function from array-API #896.
- Implemented dpctl.tensor.tril and dpctl.tensor.triu functions from array-API #910.
- Added data type objects to dpctl.tensor namespace, finfo, iinfo, can_cast, and result_type functions #913.
- Implemented dpctl.tensor.meshgrid creation function from array-API #920.
- Implemented convenience class to represent output of dpctl.tensor.usm_ndarray.flags property #921.
- Added new device attributes and kernel's device-specific attributes #894.
- Added dpctl.utils.onetrace_enabled context manager for targeted trace collection #903.
- Added support for stream keyword in __dlpack__ method, enabling support for sending usm_ndarray using mpi4py #906.
- dpctl.tensor.asarray can now transition data between incompatible devices, #951.
- Introduced "syclinterface/dpctl_sycl_types_casters.hpp" header file with declaration of conversion routines between SYCL type pointers and SyclInterface library opaque pointers #960.
- Added C-API to dpctl.program.SyclKernel and dpctl.program.SyclProgram. Added type casters for new types to "dpctl4pybind11" and added an example demonstrating its use #970.
- Introduced "dpctl/sycl.pxd" Cython declaration file to streamline use of SYCL functions from Cython, and added an example demonstrating its use #981.
- Added experimental support for sharing data allocated on sub-devices via dlpack #984.
- Added dpctl.SyclDevice.sub_group_sizes property to retrieve supported sizes of sub-group by the device #985.
Changed
- Improved queue compatibility testing in dpctl.tensor's implementation module #900.
- Added automatic measurement of array-API conformance test suite in CI #901.
- Improved performance of array metadata transfer from host to device #912.
- Used os.add_dll_directory on Windows to ensure that DPCTLSyclInterface library can be found #918.
- Refactored dpctl.tensor's implementation module #941 to streamline adding new functionality. Streamlined dpctl::tensor::usm_ndarray class implementation.
- Added debugging messaging in case when DPCTLDynamicLib::getSymbol encounters errors #956.
- Updated code base according to changes in DPC++ compiler #952, #957, #958.
- Changed dpctl to use pybind11 2.10.1 #967.
- Extended dpctl.tensor.full to accept 0d and higher dimensional arrays for fill-value parameter #982 and #995.
Fixed
- Improved SyclDevice constructor error message #893.
- Fixed issue gh-890 about dpctl.tensor.reshape function #915.
- Fixed unexpected UnboundLocalError exception in #922.
- Fixed bugs in dpctl.tensor.arange in #945.
- Fixed issue with type inference in dpctl.tensor.asarray in #949.
- Added missing docstrings for dpctl.SyclDevice properties #964.
v0.13.0
Added
- Implemented and deployed dedicated kernels for copying with casting #781, used in __setitem__, implementation of asarray, dpctl.tensor.copy functions.
- Implemented dedicated copying kernel for dpctl.tensor.reshape function #810, added support for copy keyword #807.
- Implemented dedicated kernel to copy with casting from numpy.ndarray into dpctl.tensor.usm_ndarray #817.
- Implemented dpctl.tensor.permute_dims function from array-API #787.
- Implemented dpctl.tensor.expand_dims function from array-API #788.
- Implemented dpctl.tensor.squeeze function from array-API #790.
- Implemented dpctl.tensor.broadcast_to function from array-API #791.
- Implemented dpctl.tensor.broadcast_arrays function from array-API #798.
- Implemented dpctl.tensor.flip function from array-API #801.
- Implemented dpctl.tensor.usm_ndarray.mT property per array-API #805.
- Implemented dpctl.tensor.roll function from array-API #809.
- Implemented dpctl.tensor.arange function from array-API #814.
- Implemented dpctl.tensor.zeros function from array-API #816.
- Implemented dpctl.tensor.ones, dpctl.tensor.full, dpctl.tensor.empty_like, dpctl.tensor.zeros_like, dpctl.tensor.ones_like, dpctl.tensor.full_like functions from array-API #822.
- Implemented DPCTLQueue_Memset function in SyclInterface library #812, and exposed it for dpctl.memory.MemoryUSM* classes #815.
- Implemented dpctl.utils.get_coerced_usm_type to deduce the usm type of the output array from types of input arrays in compute-follows-data execution model #797.
- Added dpctl.SyclDevice.profiling_timer_resolution property #825.
- Added dpctl.SyclDevice.platform and dpctl.SyclPlatform.default_context properties #827.
- Provided pybind11 example for functions working on dpctl.tensor.usm_ndarray container applying oneMKL functions #780, #793, #819. The example was expanded to demonstrate implementing iterative linear solvers (Chebyshev solver, and Conjugate-Gradient solver) by asynchronously submitting individual SYCL kernels from Python #821, #833, #838.
- Wrote manual page about working with dpctl.SyclQueue #829.
- Added cmake scripts to dpctl package layout and a way to query the location #853.
- Implemented dpctl.tensor.concat function from array-API #867.
- Implemented dpctl.tensor.stack function from array-API #872.
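To make the dpctl.utils.get_coerced_usm_type entry concrete, here is a pure-Python sketch of the kind of deduction it describes. It does not call dpctl. The function name `coerce_usm_type`, the precedence order "device" over "shared" over "host", and the choice to return None for unknown or empty input are assumptions for illustration and should be checked against the dpctl documentation.

```python
# Precedence assumed for this sketch: "device" > "shared" > "host".
_USM_PRECEDENCE = ("device", "shared", "host")


def coerce_usm_type(usm_types):
    """Pick one USM allocation type for an output from the input types.

    Hypothetical stand-in, not the dpctl function itself.
    Returns None when the input is empty or contains an unknown type.
    """
    seen = set()
    for t in usm_types:
        if t not in _USM_PRECEDENCE:
            return None
        seen.add(t)
    # The first type found in precedence order wins.
    for kind in _USM_PRECEDENCE:
        if kind in seen:
            return kind
    return None


print(coerce_usm_type(["host", "shared"]))  # shared
print(coerce_usm_type(["device", "host"]))  # device
```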
Changed
- Enhanced coverage collection for SyclInterface library by also collecting it during pytest run and combining traces with those collected during C-test run #818. This change also makes it unnecessary to rebuild the SyclInterface library when building the C-test executable.
- Exported keep_args_alive utility in dpctl4pybind11.hpp header #820. The utility uses sycl::handler::host_task to keep given Python arguments alive until each sycl::event from the given vector of events is complete. The host task is scheduled on the SYCL queue provided as the first argument.
- Changed the size of struct underlying dpctl.SyclEvent to avoid storing Python object previously used to keep kernel arguments scheduled with dpctl.SyclQueue.submit #823.
- Fixed docstring for dpctl.SyclTimer #824.
- Changed type of exceptions raised on failure to create dpctl.SyclDevice from ValueError to dpctl.SyclDeviceCreationError #826.
- Improved performance of pybind11 type casters #837.
- Changed implementation of dpctl.SyclProgram from using deprecated sycl::program to sycl::kernel_bundle #845.
- Removed deprecated device aspects, added new supported aspects #844.
- Updated vendored dlpack.h to version 0.7 #847.
Fixed
- Fixed dpctl.lsplatform() to work correctly when used from within Jupyter notebook #800.
- Fixed script to drive debug build #835 and fixed code to compile in debug mode #836.
- Fixed filter selector string produced in outputs of dpctl.lsplatform(verbosity=2) and dpctl.SyclDevice.print_device_info #866.
- Fixed issue with slicing reported in gh-870 in #871.
New contributor: @npolina4 contributed #867, #872 and reported #870
v0.12.0
What's changed in 0.12.0
Added
- Properties added to MemoryUSM* objects. #647
- Added dpctl.tensor.asarray #646
- Implemented DLPack support for usm_ndarray #682
- Exported dpctl.tensor.Device class #708 #718
- Added testing of examples in CI #722
- Added user manuals to dpctl documentation #712 #773
Changed
- Folder dpctl-capi/ renamed to libsyclinterface/ in sources and documentation. #666 #768
- Added workflow to publish rendered documentation on PRs #673 #753 #726
- Synchronization functions and USM allocation functions release GIL #736 #766
- dpctl.SyclEvent destructor is made non-blocking #751
Fixed
- Fixed issue in code of dpctl.tensor.usm_ndarray.T #653
- Fixed issue with dpctl.tensor.reshape's effect on contiguity flags of usm_ndarray #695
- Fixed handling of empty list by dpctl.tensor.asarray #694
- Fixed type inference with array of empty arrays in dpctl.tensor.asarray #697
- Fixed issue gh-698 with dpctl.tensor.asarray #709
- Fixed performance of item assignment from numpy array #724
- DPCTLDeviceMgr_GetNumDevices should not operate on rejected devices #737
- Fixed issue gh-729 for dpctl.tensor.reshape applied to 0-element usm_ndarray #756
- Fixed issue gh-728 with dpctl.tensor.astype #757
- Fixed type in memory overlapping test #770
- Fixed issue with operator.pos for dpctl.tensor.usm_ndarray #783
- Only call PyThread_Ensure from host_task if the main-thread interpreter is initialized and not finalizing #776 #778 #721
Full Changelog: 0.11.4...0.12.0
0.11.4
What's Changed
- Fix tests for nested context factories expecting for integration environment by @PokhodenkoSA in #705
Full Changelog: 0.11.3...0.11.4
0.11.3
Fixed
Full Changelog: 0.11.2...0.11.3
0.11.2
Added
- Extending dpctl.device_context with nested contexts (#678)
Fixed
Full Changelog: 0.11.1...0.11.2
0.11.1
0.11.0
Added
- Use Python 3.9 in public CI (#599)
- Add a new C API utility function (DPCTLDeviceMgr_GetDeviceInfoStr) to return the device info as a C string object (#620)
- New GitHub workflow to build dpctl with nightly Intel llvm/sycl + drivers (#621)
- Always raise SubDeviceCreationError even when sub-device counts are zero (#622)
- Updated OpenCL interoperability code to fix build with Intel llvm/sycl bundle (#625)
- Enabled use of default platform context extension in SYCL compilers that implement this extension (#627)
- Implemented dpctl.utils.get_execution_queue(queue_seq) utility to help implement the "compute-follows-data" convention for offload target (#632)
- Improved code coverage (#619) (#542) (#631)
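The "compute-follows-data" convention mentioned for dpctl.utils.get_execution_queue can be sketched without dpctl. In the hypothetical stand-in below, the function name `common_execution_queue` is invented for this example, plain strings play the role of queue objects, and returning None for an empty or mixed sequence is an assumption. The idea it illustrates: an operation runs on the queue its inputs already share, and there is no implied queue when the inputs disagree.

```python
def common_execution_queue(queues):
    """Return the queue shared by all inputs, or None if they differ.

    Hypothetical stand-in for illustration; any objects comparable
    with == can play the role of queues here.
    """
    queues = list(queues)
    if not queues:
        return None  # nothing to infer a queue from
    first = queues[0]
    if all(q == first for q in queues[1:]):
        return first
    return None  # inputs live on different queues


# All inputs were allocated on the same queue: computation follows the data.
print(common_execution_queue(["gpu_queue", "gpu_queue"]))  # gpu_queue
# Mixed queues: the caller has to decide explicitly where to run.
print(common_execution_queue(["gpu_queue", "cpu_queue"]))  # None
```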
Changed
- Replaced host_device device type with host in tests (#616)
- Rework the logic in dpctl.memory's copy_from_device method to work correctly with host device (#618)
- Use dpctl.device_type.host instead of dpctl.device_type.host_device (#626)
- Reinstate deprecated sycl::program that was conditionally removed from open source DPC++ toolchain (#633)
- Use LoadLibraryExA instead of LoadLibraryA to mitigate a possible DLL injection issue when we load the Level Zero DLL on Windows (#636)
- GitHub coverage workflow is changed to use oneAPI 2021.3 instead of latest to work around broken profiling instrumentation in DPC++ 2021.4 (#614)
- Update build dependencies for NumPy (#641)
- Use "readelf" on SYCL's pi_level_zero library to find out and use the exact name of ze_loader.so in SyclInterface library (#617)
Removed
- Removed use of DPC++ features deprecated in 2021.4 and open source Intel llvm/sycl compiler (#603)
Fixed
- Suppress errant CMake log (#610)
- Fixes to compile dpctl using Intel llvm/sycl compiler (#603)
- Fixed a hang by avoiding passing a nullptr argument to sycl::queue::prefetch (#612)
- Fixed the logic to return device count (#623)
- Enabled building of C extensions with dpctl by including header defining bool type for C compilers (#604)
0.11.0rc2
What's Changed
- Added new example of Python object exposing sycl_usm_array_interface by @oleksandr-pavlyk in #596
- Cleanup/c extensions with dpctl by @oleksandr-pavlyk in #604
- Silly typo behind slow copy by @oleksandr-pavlyk in #606
- Use Python 3.9 [in public CI] by @PokhodenkoSA in #599
- Suppress errant CMake log. by @diptorupd in #610
- Avoid passing nullptr to sycl::queue::prefetch to avoid buggy hang by @oleksandr-pavlyk in #612
- Moved repeated status message outside of if/else statement by @oleksandr-pavlyk in #613
- Coverage workflow is to use oneAPI 2021.3 by @oleksandr-pavlyk in #614
- replace host_device device type with host in test by @oleksandr-pavlyk in #616
- Use readelf on pi_level_zero library to find exact name of ze_loader by @oleksandr-pavlyk in #617
- Remove use of DPC++ features deprecated in 2021.4 and open source intel/llvm compiler by @oleksandr-pavlyk in #603
- Rework the logic in memory's copy_from_device by @oleksandr-pavlyk in #618
- Added tests to cover red lines in coverage report by @oleksandr-pavlyk in #619
- Add a new utility function to return the device info as a C string object. by @diptorupd in #620
- Always raise SubDeviceCreationError even when counts are zero by @oleksandr-pavlyk in #622
- Add a workflow to install nightly intel/llvm + drivers by @oleksandr-pavlyk in #621
- DPCTLSyclInterface should avoid functions that print to std::cout by @oleksandr-pavlyk in #542
- Correct the logic to return device count. by @diptorupd in #623
- dpctl.device_type.host_device -> dpctl.device_type.host by @oleksandr-pavlyk in #626
- Update opencl interoprability code to fix build with open source llvm/sycl bundle by @oleksandr-pavlyk in #625
- Enable use of default platform context extension by @oleksandr-pavlyk in #627
- [WIP] Rebuild by @Vyacheslav-Smirnov in #605
- test_get_dpcppversion relaxed by @oleksandr-pavlyk in #631
- reinstate deprecated sycl::program and that was conditionally removed from open source DPC++ toolchain by @oleksandr-pavlyk in #633
- Closes #629 by @oleksandr-pavlyk in #632
- Add triggers to CI by @1e-to in #634
- Undid restrictions for on: push triggers by @oleksandr-pavlyk in #637
- Change the way we load level zero DLL on windows. by @diptorupd in #636
- Update build deps for numpy by @xaleryb in #641
- master-> gold/2021 by @xaleryb in #642
Full Changelog: 0.10.0...0.11.0rc2