Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eachchunk fall-back for transposed or offset arrays? #130

Open
Alexander-Barth opened this issue Oct 17, 2023 · 1 comment
Open

eachchunk fall-back for transposed or offset arrays? #130

Alexander-Barth opened this issue Oct 17, 2023 · 1 comment

Comments

@Alexander-Barth
Copy link
Contributor

This is low priority issue for me, but when DiskArrays are used for memory arrays, the it might be useful to adapt the default chunksize for transposed arrays:

julia> DiskArrays.default_chunk_size[] = 1
1

julia> eachchunk(ones(4000,4000))
1×130 DiskArrays.GridChunks{2, Tuple{DiskArrays.RegularChunks, DiskArrays.RegularChunks}}:
 (1:4000, 1:31)  (1:4000, 32:62)  (1:4000, 63:93)  …  (1:4000, 4000:4000)

julia> eachchunk(ones(4000,4000)')
1×130 DiskArrays.GridChunks{2, Tuple{DiskArrays.RegularChunks, DiskArrays.RegularChunks}}:
 (1:4000, 1:31)  (1:4000, 32:62)  (1:4000, 63:93)  …  (1:4000, 4000:4000)

For a transposed array I would expect that the first chunk size of (1:31, 1:4000). I guess one can use stride to check this.

I am note sure if OffsetArray is supposed to be sported but currently all indices from eachchunk are ones-based

using  DiskArrays, OffsetArray
DiskArrays.default_chunk_size[] = 1
OA = OffsetArray(ones(4000,4000),(-1,-1));
c = eachchunk(OA);
OA[last(c)...]
# ERROR: BoundsError: attempt to access 4000×4000 OffsetArray(::Matrix{Float64}, 0:3999, 0:3999) with eltype Float64 with indices 0:3999×0:3999 at index [1:4000, 4000:4000]

None of these issue affect me currently in any production code, but I thought it might be useful to notify you about it.

I am using DiskArrays v0.3.22

@rafaqz
Copy link
Collaborator

rafaqz commented Oct 18, 2023

Thanks, yes we should test on OffsetArray. Correctness fixes are probably more important at this stage, the cost of wrong stride in Array will be lost in the cost of reading from disk (assuming this mostly happens in a broadcast/zip with a real disk-based array). But yes one day we could correct for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants