Review of software design choices in benchcab

Implemented as an external package

Positives:

Prevents model developers from improperly amending the test suite, which in turn prevents bespoke testing in long-lived development branches.
Allows for testing multiple branches of CABLE which have diverged.
Potential for extending the testing framework to other models including the coupled model.
Can be used for testing changes to model configurations as well as source code changes.

Negatives:

Being an external package means benchcab infrastructure can become out of sync with the main CABLE repository. Significant changes to model configurations require separate updates to benchcab.

Alternative approaches:

The CABLE benchcab tests could be maintained in the CABLE repository (similar to the design of pytest unit tests).

Packaged with conda

Positives:

Deployed to hh5 conda modules on Gadi

Positives:

Negatives:

Other packages in the environment can break functionality, and it is unlikely that dependencies can be pinned.

Moving to a minimal conda environment would resolve these negatives but would affect how easily users can access benchcab.

Implemented in python

Positives:

Negatives:

Initially designed to test only CABLE

Positives:

Negatives:

The design is currently not model-agnostic, and porting the system to other models will take work.

Initially designed to function only on Gadi

This was due to the lack of support for building CABLE outside of Gadi in the early stages of development.

Positives:

Negatives:

The design is currently not portable, and porting the system to other machines will take work. For example, it depends on modules, environment variables, system git or svn, system PBS and payu (only works on Gadi).
Results in complex unit tests which require mocking system dependencies.
Unable to be used inside a continuous integration pipeline.
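
To illustrate the mocking burden noted above, here is a minimal sketch of a unit test that has to replace a PBS call before it can run off Gadi. The helper `submit_job` and the job ID string are hypothetical and are not taken from the benchcab source; only the standard library `subprocess` and `unittest.mock` calls are real.

```python
import subprocess
from unittest import mock


def submit_job(script_path: str) -> str:
    """Hypothetical helper: submit a PBS job script and return the job ID."""
    proc = subprocess.run(
        ["qsub", script_path], capture_output=True, text=True, check=True
    )
    return proc.stdout.strip()


def check_submit_job_without_pbs() -> str:
    """Exercise submit_job on a machine with no PBS by mocking subprocess.run."""
    # Fake result standing in for a real qsub invocation (made-up job ID).
    fake = subprocess.CompletedProcess(
        args=["qsub", "job.sh"], returncode=0, stdout="12345.gadi-pbs\n", stderr=""
    )
    with mock.patch("subprocess.run", return_value=fake) as mock_run:
        job_id = submit_job("job.sh")
    # The test can only verify the command that would have been run.
    mock_run.assert_called_once_with(
        ["qsub", "job.sh"], capture_output=True, text=True, check=True
    )
    return job_id
```

Each additional system dependency (modules, git, payu) would need a similar patch, which is where the complexity in the unit tests comes from.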

Modular approach to model evaluation via modelevaluation.org

Positives:

Modularity: users working outside of Gadi can also benefit from the analysis scripts available on meorg without needing benchcab generated output.
Can leverage expertise in land surface model evaluation by using meorg.

Negatives:

Requires development of a web API around meorg (currently in progress) for benchcab to programmatically upload model output to the service.
Issues relating to meorg are harder to fix as it is managed by developers external to ACCESS-NRI.

Science configurations are based on a centralised namelist file

Negatives:

Difficult to run configurations with substantial differences compared to the centralised namelist configuration.
Benchcab configurations are difficult to run standalone (see this issue).
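
As a rough illustration of why large departures from the centralised namelist are awkward, the sketch below treats a science configuration as a patch merged recursively onto a base namelist. This is an assumption about the general pattern, not a description of benchcab's implementation; the function `apply_patch` and all namelist keys and values are hypothetical.

```python
import copy


def apply_patch(base: dict, patch: dict) -> dict:
    """Return a copy of base with patch merged in recursively (hypothetical helper)."""
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are groups: merge their members.
            merged[key] = apply_patch(merged[key], value)
        else:
            # Otherwise the patch value replaces (or adds) the entry.
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative centralised namelist (keys and values are made up).
base_namelist = {
    "model": {
        "options": {"soil_scheme": "default", "stomatal_scheme": "default"},
        "spinup": False,
    }
}

# A science configuration expressed as a small difference from the base.
science_patch = {"model": {"options": {"stomatal_scheme": "alternative"}}}

config = apply_patch(base_namelist, science_patch)
```

Under this pattern a configuration that differs in only a few options stays short, but one that differs substantially has to restate most of the base as overrides, and no configuration is complete without the base file alongside it.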
