clean up docstring formatting and type hints (#16)
* docs: align docstrings with napoleon (google) standards
* refactor: replace relative imports with pkg level imports
* docs: updating contributing notes to include napoleon docstring style
* docs: review amendments to docstrings/types
* docs: update coords_increasing check docstring
* docs: gather dims union types and mse types

---------

Signed-off-by: Aidan Griffiths <[email protected]>
Co-authored-by: agriffit <[email protected]>
aidanjgriffiths and agriffit authored Aug 17, 2023
1 parent 5f62616 commit 68875f1
Showing 8 changed files with 257 additions and 228 deletions.
6 changes: 3 additions & 3 deletions docs/contributing.md
@@ -30,14 +30,14 @@ A new score or metric should be developed on a separate feature branch, rebased
- The implementation of the new metric or score in xarray, ideally with support for pandas and dask
- 100% unit test coverage
- A tutorial notebook showcasing the use of that metric or score, ideally based on the standard sample data
- API documentation (docstrings) which clearly explain the use of the metrics
- API documentation (docstrings) using [Napoleon (google)](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html) style, making sure to clearly explain the use of the metrics
- A reference to the paper which described the metrics, added to the API documentation
- For metrics which do not have a paper reference, an online source or reference should be provided
- For metrics which are still under development or which have not yet had an academic publication, they will be placed in a holding area within the API until the method has been properly published and peer reviewed (i.e. `scores.emerging`). The 'emerging' area of the API is subject to rapid change, still of sufficient community interest to include, similar to a 'preprint' of a score or metric.
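For reference, a minimal docstring in the Napoleon (google) style that the guideline above asks for might look like the following; the function itself is a hypothetical example, not part of the `scores` API:

```python
def multiply(a, b):
    """Multiplies two numbers together.

    Args:
        a (float): The first factor.
        b (float): The second factor.

    Returns:
        float: The product of ``a`` and ``b``.
    """
    return a * b
```

The key conventions are a one-line summary, a blank line before each section, and `Args:`/`Returns:` headings with indented, typed entries.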

All merge requests should comply with the coding standards outlined in this document. Merge requests will undergo both a code review and a science review. The code review will focus on coding style, performance and test coverage. The science review will focus on the mathematical correctness of the implementation and the suitability of the method for inclusion within 'scores'.

A GitHub ticket should be created explaining the metric which is being implemented and why it is useful.

### Development Process for a Correction or Improvement

107 changes: 58 additions & 49 deletions src/scores/continuous.py
@@ -6,31 +6,38 @@


def mse(fcst, obs, reduce_dims=None, preserve_dims=None, weights=None):
"""
"""Calculates the mean squared error from forecast and observed data.
Returns:
- By default an xarray containing a single floating point number representing the mean absolute
error for the supplied data. All dimensions will be reduced.
- Otherwise: Returns an xarray representing the mean squared error, reduced along
the relevant dimensions and weighted appropriately.
Dimensional reduction is not supported for pandas and the user should
convert their data to xarray to formulate the call to the metric. At
most one of reduce_dims and preserve_dims may be specified.
Specifying both will result in an exception.
Args:
- fcst: Forecast or predicted variables in xarray or pandas
- obs: Observed variables in xarray or pandas
- reduce_dims: Optionally specify which dimensions to reduce when calculating MSE.
All other dimensions will be preserved.
- preserve_dims: Optionally specify which dimensions to preserve when calculating MSE. All other
dimensions will be reduced. As a special case, 'all' will allow all dimensions to
be preserved. In this case, the result will be in the same shape/dimensionality as
the forecast, and the errors will be the squared error at each point (i.e. single-value
comparison against observed), and the forecast and observed dimensions must match
precisely.
- weights: Not yet implemented. Allow weighted averaging (e.g. by area, by latitude, by population, custom)
Notes:
- Dimensional reduction is not supported for pandas and the user should convert their data to xarray
to formulate the call to the metric.
- At most one of reduce_dims and preserve_dims may be specified. Specifying both will result in an exception.
fcst (Union[xr.Dataset, xr.DataArray, pd.Dataframe, pd.Series]):
Forecast or predicted variables in xarray or pandas.
obs (Union[xr.Dataset, xr.DataArray, pd.Dataframe, pd.Series]):
Observed variables in xarray or pandas.
reduce_dims (Union[str, Iterable[str]): Optionally specify which
dimensions to reduce when calculating MSE. All other dimensions
will be preserved.
preserve_dims (Union[str, Iterable[str]): Optionally specify which
dimensions to preserve when calculating MSE. All other dimensions
will be reduced. As a special case, 'all' will allow all dimensions
to be preserved. In this case, the result will be in the same
shape/dimensionality as the forecast, and the errors will be
the squared error at each point (i.e. single-value comparison
against observed), and the forecast and observed dimensions
must match precisely.
weights: Not yet implemented. Allow weighted averaging (e.g. by
area, by latitude, by population, custom)
Returns:
Union[xr.Dataset, xr.DataArray, pd.Dataframe, pd.Series]: An object containing
a single floating point number representing the mean absolute
error for the supplied data. All dimensions will be reduced.
Otherwise: Returns an object representing the mean squared error,
reduced along the relevant dimensions and weighted appropriately.
"""

error = fcst - obs
@@ -53,38 +60,40 @@ def mse(fcst, obs, reduce_dims=None, preserve_dims=None, weights=None):
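The reduce/preserve semantics described in the docstring can be sketched with plain numpy; `mse_sketch` and its axis-based argument are illustrative stand-ins for the xarray dimension handling, not the library's API:

```python
import numpy as np

def mse_sketch(fcst, obs, reduce_axes=None):
    """Mean squared error, averaged over the reduced axes only."""
    squared_error = (np.asarray(fcst) - np.asarray(obs)) ** 2
    # reduce_axes=None mimics the default: reduce every dimension to a scalar.
    return squared_error.mean(axis=reduce_axes)

fcst = np.array([[1.0, 2.0], [3.0, 4.0]])
obs = np.array([[1.0, 1.0], [1.0, 1.0]])
mse_sketch(fcst, obs)                 # 3.5 -- all dimensions reduced
mse_sketch(fcst, obs, reduce_axes=0)  # array([2., 5.]) -- second axis preserved
```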


def mae(fcst, obs, reduce_dims=None, preserve_dims=None, weights=None):
"""**Needs a 1 liner function description**
"""Calculates the mean absolute error from forecast and observed data.
A detailed explanation is on [Wikipedia](https://en.wikipedia.org/wiki/Mean_absolute_error)
Dimensional reduction is not supported for pandas and the user should
convert their data to xarray to formulate the call to the metric.
At most one of reduce_dims and preserve_dims may be specified.
Specifying both will result in an exception.
Args:
- fcst: Forecast or predicted variables in xarray or pandas.
- obs: Observed variables in xarray or pandas.
- reduce_dims: Optionally specify which dimensions to reduce when
calculating MAE. All other dimensions will be preserved.
- preserve_dims: Optionally specify which dimensions to preserve
when calculating MAE. All other dimensions will be reduced.
As a special case, 'all' will allow all dimensions to be
preserved. In this case, the result will be in the same
shape/dimensionality as the forecast, and the errors will be
the absolute error at each point (i.e. single-value comparison
against observed), and the forecast and observed dimensions
must match precisely.
- weights: Not yet implemented. Allow weighted averaging (e.g. by
area, by latitude, by population, custom).
fcst (Union[xr.Dataset, xr.DataArray, pd.Dataframe, pd.Series]): Forecast
or predicted variables in xarray or pandas.
obs (Union[xr.Dataset, xr.DataArray, pd.Dataframe, pd.Series]): Observed
variables in xarray or pandas.
reduce_dims (Union[str, Iterable[str]]): Optionally specify which dimensions
to reduce when calculating MAE. All other dimensions will be preserved.
preserve_dims (Union[str, Iterable[str]]): Optionally specify which
dimensions to preserve when calculating MAE. All other dimensions
will be reduced. As a special case, 'all' will allow all dimensions
to be preserved. In this case, the result will be in the same
shape/dimensionality as the forecast, and the errors will be
the absolute error at each point (i.e. single-value comparison
against observed), and the forecast and observed dimensions
must match precisely.
weights: Not yet implemented. Allow weighted averaging (e.g. by
area, by latitude, by population, custom).
Returns:
- By default an xarray DataArray containing a single floating
point number representing the mean absolute error for the
Union[xr.Dataset, xr.DataArray, pd.Dataframe, pd.Series]: By default an xarray DataArray containing
a single floating point number representing the mean absolute error for the
supplied data. All dimensions will be reduced.
Alternatively, an xarray structure with dimensions preserved as
appropriate containing the score along reduced dimensions
Notes:
- Dimensional reduction is not supported for pandas and the user
should convert their data to xarray to formulate the call to the metric.
- At most one of reduce_dims and preserve_dims may be specified.
Specifying both will result in an exception.
A detailed explanation is on [Wikipedia](https://en.wikipedia.org/wiki/Mean_absolute_error)
Alternatively, an xarray structure with dimensions preserved as appropriate
containing the score along reduced dimensions
"""

error = fcst - obs
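As with MSE above, the behaviour can be sketched with plain numpy; `mae_sketch` is an illustrative stand-in for the library's xarray-based implementation:

```python
import numpy as np

def mae_sketch(fcst, obs, reduce_axes=None):
    """Mean absolute error, averaged over the reduced axes only."""
    abs_error = np.abs(np.asarray(fcst) - np.asarray(obs))
    return abs_error.mean(axis=reduce_axes)

fcst = np.array([[1.0, 2.0], [3.0, 4.0]])
obs = np.array([[1.0, 1.0], [1.0, 1.0]])
mae_sketch(fcst, obs)  # 1.5 -- mean of the absolute errors 0, 1, 2, 3
```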
6 changes: 5 additions & 1 deletion src/scores/probability/__init__.py
@@ -2,4 +2,8 @@
Import the functions from the implementations into the public API
"""

from scores.probability.crps_impl import (
adjust_fcst_for_crps,
crps_cdf,
crps_cdf_brier_decomposition,
)
24 changes: 15 additions & 9 deletions src/scores/probability/checks.py
@@ -1,5 +1,5 @@
"""
This module contains methods which make assertions at runtime about the state of various data
structures and values
"""

@@ -8,24 +8,30 @@


def coords_increasing(da: xr.DataArray, dim: str):
"""
Returns True if coordinates along `dim` dimension of `da` are increasing,
False otherwise. No in-built raise if `dim` is not a dimension of `da`.
"""Checks if coordinates in a given DataArray are increasing.
Note: No in-built raise if `dim` is not a dimension of `da`.
Args:
da (xr.DataArray): Input data
dim (str): Dimension to check if increasing
Returns:
(bool): Returns True if coordinates along `dim` dimension of
`da` are increasing, False otherwise.
"""
result = (da[dim].diff(dim) > 0).all()
return result
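The check above reduces to "are all consecutive coordinate differences positive", which can be sketched in plain numpy; `coords_increasing_sketch` is an illustrative stand-in for the xarray version:

```python
import numpy as np

def coords_increasing_sketch(coords):
    """True when consecutive differences along the coordinate are all positive."""
    return bool(np.all(np.diff(coords) > 0))

coords_increasing_sketch([0.0, 1.0, 2.5])  # True
coords_increasing_sketch([0.0, 2.0, 2.0])  # False -- not strictly increasing
```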


def cdf_values_within_bounds(cdf: xr.DataArray) -> bool:
"""
Checks that 0 <= cdf <= 1. Ignores NaNs.
"""Checks that 0 <= cdf <= 1. Ignores NaNs.
Args:
cdf: array of CDF values
cdf (xr.DataArray): array of CDF values
Returns:
`True` if `cdf` values are all between 0 and 1 whenever values are not NaN,
or if all values are NaN; and `False` otherwise.
(bool): `True` if `cdf` values are all between 0 and 1 whenever values are not NaN,
or if all values are NaN; and `False` otherwise.
"""
return cdf.count() == 0 or ((cdf.min() >= 0) & (cdf.max() <= 1))
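The same logic, including the all-NaN special case handled by `cdf.count() == 0`, can be sketched with plain numpy; `cdf_values_within_bounds_sketch` is an illustrative stand-in for the xarray version:

```python
import numpy as np

def cdf_values_within_bounds_sketch(cdf):
    """True if all non-NaN values lie in [0, 1], or if every value is NaN."""
    cdf = np.asarray(cdf, dtype=float)
    valid = cdf[~np.isnan(cdf)]
    # An empty valid set (all NaN) counts as within bounds, as in the original.
    return valid.size == 0 or bool((valid.min() >= 0) & (valid.max() <= 1))

cdf_values_within_bounds_sketch([0.0, 0.5, np.nan, 1.0])  # True
cdf_values_within_bounds_sketch([-0.1, 0.5])              # False
```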

