(still under development) PIT Metric (Port of Jive) #399

Draft
wants to merge 2 commits into
base: develop
from

Conversation

HCookie
Collaborator

@HCookie HCookie commented May 17, 2024

Add PIT metric

Port of the jive code to the scores codebase

Initial port:

  • Copy of code
  • Initial Testing

Further work

  • 100% unit test coverage
  • Tutorial notebook

Documentation

Update 17/05/24

The initial work on this issue was a straight copy of the original code from the Jive codebase. It should not be merged yet, as it has been neither optimised nor scientifically checked. Because this metric introduces a variety of utility functions for CDFs, those also need to be tested and checked.
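
For context, a minimal sketch of what a PIT computation involves: each PIT value is the forecast CDF evaluated at the observation, and the values are then binned into a histogram that should look flat for a calibrated forecast. This assumes Gaussian forecasts purely for illustration; the function names are hypothetical, and this is neither the Jive code nor the `scores` API.

```python
import math


def normal_cdf(x, mean, std):
    """CDF of a normal distribution evaluated at x (illustrative forecast CDF)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def pit_values(observations, means, stds):
    """PIT value per forecast/observation pair: the forecast CDF at the observation."""
    return [normal_cdf(o, m, s) for o, m, s in zip(observations, means, stds)]


def pit_histogram(pits, n_bins=10):
    """Count PIT values in equal-width bins on [0, 1]; a PIT of exactly 1 goes in the last bin."""
    counts = [0] * n_bins
    for p in pits:
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts
```

An observation equal to the forecast mean gives a PIT of 0.5; PIT values piling up near 0 or 1 suggest the forecast distributions are too narrow or biased.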

@HCookie HCookie marked this pull request as draft May 17, 2024 01:24
@nicholasloveday
Collaborator

@rob-taggart had some suggested improvements to PIT. I'll let Rob comment on whether it can be ported straight across and updated later, or whether it needs updating first.

@nikeethr
Collaborator

nikeethr commented May 17, 2024

I'm using #142 as a dumping ground for scientific notes, to support my own learning of the metric, while this pull request will mainly focus on implementation.

Any information in #142 should not be taken as "polished", and it is not required for the implementation in this particular pull request. I'll go with @rob-taggart's advice on what's required for the initial port.

If we need to split out issues, or update or generalize the implementation further, we will address that in future issues. The other approach is to completely dissect the code and redesign it from first principles, so that the data structures and operators map onto the mathematical properties of the underlying computations. That will take a lot longer, but will naturally be well suited to generalizations and extensions.

I'm leaning toward the simple approach of cleaning up the port for now, based on advice on its current state, especially if the use case is fairly specific and will not be extended for the foreseeable future. Let me know what you think, @nicholasloveday / @tennlee.

@tennlee
Collaborator

tennlee commented May 19, 2024

Consider the option of putting a version in scores.emerging in the first instance if further redesign or changes are likely to still be needed.

@tennlee tennlee changed the title from "PIT Metric (Port of Jive)" to "(still under development) PIT Metric (Port of Jive)" May 20, 2024
@rob-taggart
Collaborator

@nikeethr, thanks for all your work looking into PIT. I don't really have time to look into it now, as I am currently working on three different projects. There may be an opportunity to give some time to it in July or August, after one of those projects wraps up.

cc @nicholasloveday, @tennlee

@nikeethr
Collaborator

Thanks for your comment @rob-taggart. In that case, I'll put an initial implementation in the emerging space as @tennlee suggested. We can revisit it to integrate it into the core library once you have more time.

@nikeethr nikeethr changed the title from "(still under development) PIT Metric (Port of Jive)" to "(still under development, emerging) PIT Metric (Port of Jive)" May 28, 2024
@nikeethr nikeethr changed the title from "(still under development, emerging) PIT Metric (Port of Jive)" to "(still under development) PIT Metric (Port of Jive)" May 28, 2024
@durgals
Contributor

durgals commented Jul 30, 2024

In hydrology, PIT values are typically summarized using the alpha index, which quantifies the deviation from a uniform distribution. It would be good to have this score in future releases to appeal to a wider audience.
Reference: Renard, B., Kavetski, D., Kuczera, G., Thyer, M., Franks, S.W., 2010. Understanding predictive uncertainty in hydrologic modeling: The challenge of identifying input and structural errors. Water Resour. Res., 46(5): W05521.
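
As a rough illustration of the idea, the sketch below summarizes a set of PIT values by how far their sorted values sit from evenly spaced uniform quantiles, using the form alpha = 1 - (2/n) * sum(|sorted PIT - theoretical quantile|). That form is an assumption about how the index is commonly written, the choice of i/(n+1) for the theoretical quantiles is also an assumption, and `alpha_index` is a hypothetical name; check the reference before relying on any of it.

```python
def alpha_index(pits):
    """Summarize deviation of PIT values from uniformity (assumed formulation).

    Returns 1.0 when the sorted PIT values coincide with the theoretical
    uniform quantiles; larger deviations give smaller values.
    """
    n = len(pits)
    if n == 0:
        raise ValueError("at least one PIT value is required")
    sorted_pits = sorted(pits)
    # Theoretical uniform quantiles i / (n + 1): an assumed plotting position.
    total_deviation = sum(
        abs(p - (i + 1) / (n + 1)) for i, p in enumerate(sorted_pits)
    )
    return 1.0 - 2.0 * total_deviation / n
```

With this formulation, PIT values that cluster around 0.5 (forecast distributions too wide) or near 0 and 1 (too narrow) both lower the index.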

@tennlee
Collaborator

tennlee commented Jul 30, 2024

Hi @durgals. Thanks very much for taking the time to leave a comment. It's helpful to get user feedback so we can get our priorities in line with user needs. PIT is on our to-do list. I'm not across this metric myself (we have other contributors who are). As such, I wanted to ask if there are any specific variations or considerations which you think are relevant for your needs? I'm aware sometimes there are nuances even within a published metric (or options which need to be supported) which can inform how useful an implementation is.

@durgals
Contributor

durgals commented Jul 31, 2024

Hi @tennlee. Thank you for your prompt reply and for considering our feedback. Here are a few things to consider:

  1. An argument for plotting position, such as Hazen, Weibull, etc.
  2. Treatment of censoring thresholds (Wang, Q.J., Robertson, D.E., 2011. Multisite probabilistic forecasting of seasonal flows for streams with zero value occurrences. Water Resour. Res., 47: W02546).
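
The two considerations above could look roughly like this in code. The Weibull, i/(n+1), and Hazen, (i-0.5)/n, plotting positions are standard formulas. The censoring treatment shown, drawing the PIT uniformly between 0 and the forecast CDF at the threshold when the observation is at or below it, is one common randomized approach and only an assumption about what the cited paper does. All function and parameter names are hypothetical.

```python
import random


def plotting_positions(n, method="weibull"):
    """Theoretical non-exceedance probabilities for n ranked values."""
    if method == "weibull":
        return [i / (n + 1) for i in range(1, n + 1)]
    if method == "hazen":
        return [(i - 0.5) / n for i in range(1, n + 1)]
    raise ValueError(f"unknown plotting position method: {method!r}")


def censored_pit(obs, cdf, threshold, rng=random):
    """PIT for one observation, with a randomized value for censored data.

    `cdf` is a callable returning the forecast CDF at a given value.
    If `obs` is at or below `threshold`, the PIT is drawn uniformly from
    [0, cdf(threshold)] (assumed treatment); otherwise it is cdf(obs).
    """
    if obs <= threshold:
        return rng.uniform(0.0, cdf(threshold))
    return cdf(obs)
```

Passing a seeded `random.Random` instance as `rng` makes the randomized values reproducible.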

6 participants