Why a new package? #231

Closed
ParadaCarleton opened this issue Feb 10, 2023 · 4 comments

Comments

@ParadaCarleton

Why is this a new package, and not a PR to DimensionalData.jl or AxisSets or AxisArrays or one of the other dozen XArray packages in Julia?

@lazarusA
Collaborator

Before my time, but taking a look at the repos it looks like:

  • the initial dev of YAXArrays was like 8 years ago.
  • DimensionalData ~ 4 years ago.
  • AxisSets 2 years ago.
  • AxisArrays 7 years ago.
  • AxisKeys 3 years ago.

Hence, I would say that the development of YAXArrays has pretty much been happening in parallel with the others. Besides, nowadays many of the internals of YAXArrays have some component from those other packages, so it combines strengths from all of them. Still, the design and scope of those packages are quite different from YAXArrays.jl.
For more technical reasons maybe @felixcremer could jump in.

@felixcremer
Member

Just a bit more historical context. The functionality here was first developed in the Catlab.jl and then the ESDL.jl geospatial data cube packages. As @lazarusA said, development of these started eight years ago. The name of this package comes precisely from the vast number of Julia packages with similar functionality that already existed when we split the geospatial parts from the data handling parts. That was some three years ago, and it was not clear which approach was going to prevail in the Julia ecosystem. Also, I have not looked much into the other packages, but DimensionalData and YAXArrays initially made different trade-offs. DimensionalData tries to make array handling as cheap as possible, with no overhead. YAXArrays has a bit more overhead in the construction, but then tries to distribute the computation over more workers and threads.
Since DimensionalData has a much better developed and tested indexing interface, we are planning to switch the underlying array type to be a subtype of AbstractDimArray; see #173.

@ParadaCarleton
Author

ParadaCarleton commented Feb 19, 2023

I think that makes lots of sense, thanks! Mostly I'd like to avoid extra fragmentation. Having lots of different array labelling packages means compatibility is a nightmare and none of them get enough testing. But if this is doing something different for distributed computing, it makes sense to have it separate. It should be pretty easy to support as well, as long as it subtypes AbstractDimArray.

> YAXArrays has a bit more overhead in the construction, but then tries to distribute the computation over more workers and threads.

In that case, would it maybe make sense to merge this into the better-known DimensionalData.jl for discoverability, or to move both to a common organization and rename YAXArrays to something like DistributedDimArrays?

@meggart
Member

meggart commented Apr 18, 2023

In principle I agree that this functionality should be available directly through DimensionalData.jl. For legacy reasons, though, we will keep this package around for a while. Meanwhile, I am building https://github.com/meggart/DiskArrayEngine.jl, which provides the same distributed computing capabilities but works directly at the DiskArray level. This means it can easily be wrapped by DimensionalData, and we are currently working on ways to integrate DiskArrayEngine as an optional alternative computing engine for DimensionalData.
