Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds twcrps for ensembles #644

Merged
merged 23 commits into from
Sep 10, 2024

Conversation

nicholasloveday
Copy link
Collaborator

@nicholasloveday nicholasloveday commented Aug 26, 2024

About

This pull request includes:

  • twCRPS for ensembles implementation
  • A twCRPS convenience function tail_twcrps_for_ensembles for threshold weighting of the tail
  • improvements to the crps_for_ensembles tests (it can now take xr.datasets and broadcasting is tested)

Tutorial

The tutorial is expected to be completed in a separate pull request.

Further extensions

In the future, I can to add other convenience functions for different kinds of threshold weighting (e.g., interval)

Checklist

Development for new xarray-based metrics

  • Works with n-dimensional data and includes reduce_dims, preserve_dims, and weights args.
  • Typehints added
  • Docstrings complete and followed Napoleon (google) style
  • Reference to paper/webpage is in docstring
  • Add error handling
  • Imported into the API

Testing of new xarray-based metrics

  • 100% unit test coverage
  • Test that metric is compatible with dask.
  • Test that metrics work with inputs that contain NaNs
  • Test that broadcasting with xarray works
  • Test both reduce and preserve dims arguments work
  • Test that errors are raised as expected
  • Test that it works with both xr.Dataarrays and xr.Datasets

Documentation

@nicholasloveday nicholasloveday changed the title (In progress) 629 twcrps for ensembles (Ready for review) 629 twcrps for ensembles Aug 26, 2024
@nicholasloveday
Copy link
Collaborator Author

I should also mention that I compared the results from tail_twcrp_for_ensemble to the crps_cdf implementation with equivalent weights and found that the results converged with increasing sample size (even more so with method="fair")

Copy link
Collaborator

@rob-taggart rob-taggart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @nicholasloveday . Some suggestions for docstrings

src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
docs/api.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@Steph-Chong Steph-Chong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Allen et al. (2023) (https://doi.org/10.1137/22M1532184) is paywalled. (Except for the supplementary materials, which are freely available). Is there also a non-paywalled reference available for these metrics? (I realise there isn't always a good, non-paywalled reference available).

  2. I was looking at how Tail twCRPS for Ensembles and twCRPS for Ensembles appear in included.md when rendered (see: https://scores.readthedocs.io/en/228-documentation-testing-branch/included.html). They look good as is. However, I was wondering, should they perhaps be included as nested items underneath "Continuous Ranked Probability Score (CRPS) for Ensembles" - in a similar fashion to how the CRPS for CDFs functions appear in included.md? I really have no opinion on this - I imagine it is fine either way - I just wanted to raise the question.

Copy link
Collaborator

@tennlee tennlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this a number of days ago and didn't realise it hadn't shown. Apologies.

README.md Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Show resolved Hide resolved
src/scores/probability/crps_impl.py Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
@nicholasloveday
Copy link
Collaborator Author

  1. Allen et al. (2023) (https://doi.org/10.1137/22M1532184) is paywalled. (Except for the supplementary materials, which are freely available). Is there also a non-paywalled reference available for these metrics? (I realise there isn't always a good, non-paywalled reference available).

    1. I was looking at how Tail twCRPS for Ensembles and twCRPS for Ensembles appear in included.md when rendered (see: https://scores.readthedocs.io/en/228-documentation-testing-branch/included.html). They look good as is. However, I was wondering, should they perhaps be included as nested items underneath "Continuous Ranked Probability Score (CRPS) for Ensembles" - in a similar fashion to how the CRPS for CDFs functions appear in included.md? I really have no opinion on this - I imagine it is fine either way - I just wanted to raise the question.
  1. There is a preprint on arXiv https://arxiv.org/abs/2202.12732 . Many of our references in scores are behind paywalls, but by Googling the title you can usually find the PDF. I'll leave this as is until we make a decision to additionally add links to non-paywalled articles everywhere, but I definitely think that it's worthwhile to consider.
  2. Potentially. What I have done is consistent with how twMSE and MSE appear in the documentation. I'll leave it as is unless someone has a strong opinion on it.

@nicholasloveday
Copy link
Collaborator Author

Thank you for those helpful reviews. I have pushed up the changes.

src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
src/scores/probability/crps_impl.py Outdated Show resolved Hide resolved
@tennlee tennlee changed the title (Ready for review) 629 twcrps for ensembles twcrps for ensembles Sep 10, 2024
@tennlee tennlee changed the title twcrps for ensembles Adds twcrps for ensembles Sep 10, 2024
@tennlee tennlee merged commit 73d46e8 into nci:develop Sep 10, 2024
11 checks passed
@tennlee tennlee mentioned this pull request Sep 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants