Why a new package? #231
Before my time, but taking a look at the repos it looks like:
Hence, I would say that YAXArrays has largely been developed in parallel with the other packages. Besides, nowadays many of the internals of YAXArrays rely on components from those packages, so it combines the strengths of all of them. The design and scope of those packages are quite different from those of YAXArrays.jl.
Just a bit more historical context. The functionality here was first developed in CABLAB.jl and then in ESDL.jl, both geospatial data cube packages. Their development started, as @lazarusA said, eight years ago. The name of this package comes precisely from the large number of packages with similar functionality in Julia at the time we split the geospatial parts from the data handling parts. That was some three years ago, and it was not clear which approach would prevail in the Julia ecosystem. Also, I have not looked much into the other packages, but DimensionalData and YAXArrays made different trade-offs initially. DimensionalData tries to make array handling as cheap as possible, with no overhead. YAXArrays has a bit more overhead in construction, but then tries to distribute the computation over more workers and threads.
I think that makes lots of sense, thanks! Mostly I'd like to avoid extra fragmentation. Having lots of different array labelling packages means compatibility is a nightmare and none of them get enough testing. But if this is doing something different for distributed computing, it makes sense to have it separate. It should be pretty easy to support as well, as long as it subtypes
In that case, would it maybe make sense to merge this into the better-known DimensionalData.jl for discoverability, or to move them to a common organization and rename YAXArrays something like
In principle, I agree that this functionality should be available directly through DimensionalData.jl. For legacy reasons, though, we will keep this package around for a while. Meanwhile, I am building https://github.com/meggart/DiskArrayEngine.jl, which provides the same distributed computing capabilities but works directly at the DiskArray level. This means it can easily be wrapped by DimensionalData, and we are working on ways to integrate DiskArrayEngine as an optional alternative computing engine for DimensionalData.
Why is this a new package, and not a PR to DimensionalData.jl or AxisSets or AxisArrays or one of the other dozen XArray packages in Julia?