Replies: 1 comment
-
Right now, nullable data types are not supported in the latest release, because NumPy does not support them. Masked arrays live in a separate NumPy submodule (numpy.ma) that is also not yet supported, and they appear to be built from two arrays instead of one. As of PR #1324, DaCe supports marshalling any type that provides a contiguous memory block (i.e., one that can be converted to a C array when we generate C/C++ code). If Apache Arrow supports such a mode, then support can be implemented (see create_datadescriptor for how we marshal types from libraries such as numpy/cupy/pytorch). It is unclear to me what the datatype of a nullable array is (is it a struct of In conclusion, neither pyarrow (and arrow) nor
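To make the "two arrays instead of one" point concrete, here is a minimal sketch that uses only NumPy (it does not call DaCe). It shows that a masked array can be split into a plain values array and a plain boolean mask, each of which is an ordinary contiguous block. The helper `masked_sum` is a hypothetical name used for illustration. Whether passing the two arrays to a DaCe program as separate arguments fits a given use case is an assumption, not something confirmed here.

```python
import numpy as np

# A masked array with one missing entry (mask=True means "missing").
masked = np.ma.masked_array([1.0, 2.0, 3.0, 4.0],
                            mask=[False, True, False, False])

# NumPy keeps the values and the mask as two separate ndarrays.
values = np.ma.getdata(masked)     # plain float64 array
mask = np.ma.getmaskarray(masked)  # plain bool array of the same shape

# Each piece on its own is an ordinary contiguous block of memory.
values_contiguous = values.flags["C_CONTIGUOUS"]
mask_contiguous = mask.flags["C_CONTIGUOUS"]


def masked_sum(values: np.ndarray, mask: np.ndarray) -> float:
    """Hypothetical helper: sum the entries whose mask flag is False."""
    return float(np.sum(values[~mask]))


total = masked_sum(values, mask)
```

Splitting the masked array this way keeps each argument a regular NumPy array, which is the kind of input the contiguous-memory requirement above describes.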
-
Does DaCe support Apache Arrow as a data source? Although the documentation does not state it explicitly, NumPy appears to be the only supported "array format". The reason for using Arrow arrays/tables is that all of their data types are nullable, while NumPy has no direct support for nullable data types (except floating-point numbers). I'm very new to DaCe, so a brief explanation of why DaCe can or cannot support Apache Arrow would be very helpful.
If supporting the Arrow columnar format is not immediately feasible, a possible workaround for the nullability issue is to use np.ma.MaskedArray. How well does DaCe support NumPy's masked arrays?
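For context on how the two representations relate: the Arrow columnar format stores a nullable primitive array as a values buffer plus a validity bitmap (one bit per element, least-significant bit first, where 1 means valid). The sketch below mimics that layout with NumPy only (it does not use pyarrow) and converts it to a masked array. The column contents are made up for illustration.

```python
import numpy as np

# Hypothetical column [10, null, 30, 40, null], split into two buffers.
# Slots that correspond to nulls hold arbitrary data (0 here).
values = np.array([10, 0, 30, 40, 0], dtype=np.int64)
valid = np.array([True, False, True, True, False])

# Arrow-style validity bitmap: bit i (LSB first) is 1 when element i is valid.
bitmap = np.packbits(valid, bitorder="little")

# Going back: unpack the bitmap and build a NumPy masked array.
# Note the inversion: in numpy.ma, mask=True means "missing".
recovered_valid = np.unpackbits(
    bitmap, count=len(values), bitorder="little"
).astype(bool)
masked = np.ma.masked_array(values, mask=~recovered_valid)
```

In both representations the nullable column is two flat buffers rather than a single array, so the masked-array workaround and a hypothetical Arrow integration would face the same question of how to pass two buffers.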