Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataexample update #1774

Merged
merged 31 commits into from
Oct 1, 2024
Merged

Dataexample update #1774

merged 31 commits into from
Oct 1, 2024

Conversation

hrobarts
Copy link
Contributor

@hrobarts hrobarts commented Apr 11, 2024

Changes

Use zenodo_get for the data download and validation

Testing you performed

Updated download_data tests

  • Mock zip file when zenodo_get is called
  • Create the mock zip file in the CIL data folder rather than tmp folder because we don't always have access to unzip files in the tmp folder

Related issues/links

depends on dvolgyes/zenodo_get#25

Checklist

  • I have performed a self-review of my code
  • I have added docstrings in line with the guidance in the developer guide
  • I have updated the relevant documentation
  • I have implemented unit tests that cover any new or modified functionality
  • CHANGELOG.md has been updated with any functionality change
  • Request review from all relevant developers
  • Change pull request label to 'Waiting for review'

Contribution Notes

Please read and adhere to the developer guide and local patterns and conventions.

  • The content of this Pull Request (the Contribution) is intentionally submitted for inclusion in CIL (the Work) under the terms and conditions of the Apache-2.0 License
  • I confirm that the contribution does not violate any intellectual property rights of third parties

@hrobarts hrobarts marked this pull request as ready for review June 28, 2024 09:33
@hrobarts hrobarts requested a review from gfardell June 28, 2024 09:40
Copy link
Member

@lauramurgatroyd lauramurgatroyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good, I have just suggested some changes to docs and I think the unit tests could be expanded a bit.

I think it would be nice to add further documentation with examples of how to use it. I just tested trying to get the walnut data locally.

I was expecting to be able to do dataexample.WALNUT.get but I think that was wrong - I had to use dataexample.WALNUT.download_data

scripts/requirements-test.yml Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
Wrappers/Python/cil/utilities/dataexample.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/utilities/dataexample.py Outdated Show resolved Hide resolved
Wrappers/Python/test/test_dataexample.py Show resolved Hide resolved
Wrappers/Python/test/test_dataexample.py Show resolved Hide resolved
Wrappers/Python/test/test_dataexample.py Show resolved Hide resolved
@casperdcl casperdcl mentioned this pull request Sep 16, 2024
10 tasks
@hrobarts
Copy link
Contributor Author

It looks good, I have just suggested some changes to docs and I think the unit tests could be expanded a bit.

I think it would be nice to add further documentation with examples of how to use it. I just tested trying to get the walnut data locally.

I was expecting to be able to do dataexample.WALNUT.get but I think that was wrong - I had to use dataexample.WALNUT.download_data

Thank you for the review Laura! You can just use dataexample.WALNUT.get(data_dir) if the data is already in data_dir but otherwise you have to use dataexample.WALNUT.download_data we decided it's best to be really explicit if we're downloading large datasets to the users computer. I agree this is not super clear so I will add some examples.

@lauramurgatroyd
Copy link
Member

It looks good, I have just suggested some changes to docs and I think the unit tests could be expanded a bit.
I think it would be nice to add further documentation with examples of how to use it. I just tested trying to get the walnut data locally.
I was expecting to be able to do dataexample.WALNUT.get but I think that was wrong - I had to use dataexample.WALNUT.download_data

Thank you for the review Laura! You can just use dataexample.WALNUT.get(data_dir) if the data is already in data_dir but otherwise you have to use dataexample.WALNUT.download_data we decided it's best to be really explicit if we're downloading large datasets to the users computer. I agree this is not super clear so I will add some examples.

Ok, that makes sense, but yes examples would be great!

@hrobarts
Copy link
Contributor Author

I've updated the docs to describe what the remote datasets are and how to use them:

image

Also removed the test checking if the zenodo record exists.

recipe/meta.yaml Outdated Show resolved Hide resolved
scripts/create_local_env_for_cil_development.sh Outdated Show resolved Hide resolved
scripts/requirements-test.yml Outdated Show resolved Hide resolved
Wrappers/Python/cil/utilities/dataexample.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/utilities/dataexample.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/utilities/dataexample.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/utilities/dataexample.py Show resolved Hide resolved
Signed-off-by: Casper da Costa-Luis <[email protected]>
Signed-off-by: Casper da Costa-Luis <[email protected]>
@hrobarts hrobarts merged commit b91c0eb into master Oct 1, 2024
11 checks passed
@hrobarts hrobarts deleted the dataexample_update branch October 1, 2024 12:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Status: WIP
Development

Successfully merging this pull request may close these issues.

3 participants