Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support .lindi.d directory representation (in addition to .lindi.tar) #92

Merged
merged 43 commits into from
Sep 17, 2024

Conversation

magland
Copy link
Collaborator

@magland magland commented Aug 28, 2024

This is builds on #89 for supporting a directory representation of a lindi file.

This behaves the same as .lindi.tar except instead of the files and blobs going into the .tar binary file, they go into a directory.

Here's an example

import numpy as np
import lindi

# Create a new lindi binary file
with lindi.LindiH5pyFile.from_lindi_file('example.lindi.d', mode='w') as f:
    f.attrs['attr1'] = 'value1'
    f.attrs['attr2'] = 7
    ds = f.create_dataset('dataset1', shape=(1000, 1000), dtype='f')
    ds[...] = np.random.rand(1000, 1000)

# Later read the file
with lindi.LindiH5pyFile.from_lindi_file('example.lindi.d', mode='r') as f:
    print(f.attrs['attr1'])
    print(f.attrs['attr2'])
    print(f['dataset1'][...])

Running this will create a directory with the following files

example.lindi.d/
  lindi.json
  blobs/
    dataset1/
      0.0

You can also start from a remote (DANDI) nwb or lindi.nwb file and then augment it using a local directory. This I think is more intuitive compared with the staging area method, and I would propose to replace that.

While I understand that there are advantages of having the blobs be contained in files that are themselves independently open-able, I believe this approach is the way to go when using tools such as pynwb. But this same structure could be used to support other approaches to making a modular NWB... with a tool other than pynwb.

@rly @oruebel

@codecov-commenter
Copy link

codecov-commenter commented Aug 28, 2024

Codecov Report

Attention: Patch coverage is 39.50762% with 516 lines in your changes missing coverage. Please review.

Project coverage is 69.09%. Comparing base (ff05704) to head (7be5fb3).

Files with missing lines Patch % Lines
lindi/tar/lindi_tar.py 10.49% 273 Missing ⚠️
lindi/LindiH5pyFile/LindiH5pyFile.py 54.33% 79 Missing ⚠️
lindi/tar/create_tar_header.py 3.70% 52 Missing ⚠️
...ndi/LindiH5pyFile/LindiReferenceFileSystemStore.py 58.77% 47 Missing ⚠️
lindi/tar/LindiTarStore.py 29.82% 40 Missing ⚠️
lindi/LindiH5ZarrStore/LindiH5ZarrStore.py 85.03% 19 Missing ⚠️
...ndi/conversion/create_zarr_dataset_from_h5_data.py 81.25% 3 Missing ⚠️
...indi/LindiH5pyFile/writers/LindiH5pyGroupWriter.py 0.00% 2 Missing ⚠️
lindi/LindiH5ZarrStore/_util.py 75.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main      #92       +/-   ##
===========================================
- Coverage   79.47%   69.09%   -10.38%     
===========================================
  Files          30       33        +3     
  Lines        2265     3032      +767     
===========================================
+ Hits         1800     2095      +295     
- Misses        465      937      +472     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rly
Copy link
Contributor

rly commented Sep 17, 2024

@magland and I discussed and reviewed this PR over video call today. I approve of the general approach. And this PR / the pre-release works with Dendro. We decided to merge this now and make a 0.4.0 release to not block progress on downstream projects. I will do a more thorough review and we can add tests in a later PR.

@magland magland merged commit 2bc4007 into main Sep 17, 2024
6 checks passed
@magland magland deleted the lindi-dir branch September 17, 2024 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants