Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding date based pruning for ESM #465

Merged

Conversation

blimlim
Copy link
Collaborator

@blimlim blimlim commented Jul 23, 2024

Closes #464

This pull requests adds support for date based restart pruning for ESM1.5, which was previously implemented for the MOM models. As ESM1.5 uses MOM as a submodel, and produces the same ocean date files ocean_solo.res in the restart directories, I've just copied the get_restart_datetime method from mom_mixin.py into access.py.

I've manually tested it with with different monthly and yearly restart frequencies. E.g. running in month long sections, with restart_freq: 3MS for a total of 10 month results in the following archive:

metadata.yaml  output000  output001  output002  output003  output004  output005  output006  output007  output008  output009  pbs_logs  restart000  restart003  restart006  restart009

Likewise running for 6 months with restart_freq: 2MS results in

metadata.yaml  output000  output001  output002  output003  output004  output005  restart000  restart002  restart004  restart005

Creating a fake group of restart directories with yearly gaps in their ocean_solo.res dates, and running the pruning with restart_freq: 7YS also looks like it works (note that restart010 was the first of the fake directories):

metadata.yaml  pbs_logs    restart017  restart031  restart045  restart059  restart073  restart087  restart101 
output102      restart010  restart024  restart038  restart052  restart066  restart080  restart094  restart102

The implementation and manual tests are both a bit ad-hoc, and so it would be great to get another opinion on whether what's here is sufficient or if its better to try and do it differently.

@pep8speaks
Copy link

pep8speaks commented Jul 23, 2024

Hello @blimlim! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 234:24: W291 trailing whitespace

Line 94:8: E114 indentation is not a multiple of four (comment)
Line 96:24: W291 trailing whitespace
Line 98:10: W292 no newline at end of file

Line 232:73: W291 trailing whitespace
Line 235:41: E203 whitespace before ':'
Line 243:6: W292 no newline at end of file

Comment last updated at 2024-07-26 02:46:27 UTC

@blimlim blimlim requested a review from jo-basevi July 23, 2024 23:56
@jo-basevi
Copy link
Collaborator

Thanks for opening this PR and running some tests!

I think the testing is probably sufficient. There's integration-like tests for pruning logic using a couple years of restarts and using access-om2 model's get_restart_datetime method (which is similar logic wise) in payu/test/test_prune_restarts.py. There's tests for checking the datetime in ocean_solo.res files are parsed in test/models/test_mom_mixin.py.

Testing has been a bit tricky with setting up Experiment classes which is needed to initialise Model classes. Current tests or functions probably need a rethink to make it more "unit"-ish and include mocking, but that's definitely outside of the scope of this PR!

An potential option without creating new tests is to refactor so the access-om2 and access drivers to use the same function, e.g. could add a generic helper function to the mom.py file or similar?

def get_restart_datetime_using_mom_submodel(model, restart_path):
    """Given a model object and a restart path, check for
    a MOM sub-model and use it's restart files to find a cftime datetime 
   (for date-based restart pruning)"""
    for sub_model in model.expt.models:
        if sub_model.model_type == 'mom' :
            mom_restart_path = os.path.join(restart_path, sub_model.name)
            return sub_model.get_restart_datetime(mom_restart_path)

    raise NotImplementedError(
        f'Cannot find mom sub-model: {model.model_type} date-based '
        'restart pruning requires the mom sub-model to '
        'determine restart dates'
    )

Then in access/access-om2 drivers:

def get_restart_datetime(self, restart_path):
        """Given a restart path, parse the restart files and
        return a cftime datetime (for date-based restart pruning)"""
       # Use mom submodel to determine restart datetimes
        get_restart_datetime_using_mom_submodel(model=self, restart_path=restart_path)

Having a helper function outside of a Model driver class might make testing easier in future too, as can just create a mocks of the model class. (I'm now wondering if the mom mixin should instead just be helper functions too...)

That said, it's not a big issue and I'm sure the code you have added works, so I am happy to hit approve on this PR!

jo-basevi
jo-basevi previously approved these changes Jul 26, 2024
@blimlim
Copy link
Collaborator Author

blimlim commented Jul 26, 2024

Thanks for reviewing this! I think thats a great idea with the helper function, and it helps avoid copying the exact same code across the different models. I've had a go at copying that in and it still is working with the manual tests of ESM. Would you be happy to have a quick look through those changes? No rush though!

Copy link
Collaborator

@jo-basevi jo-basevi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Re-approving the changes

@blimlim blimlim merged commit a763ea1 into payu-org:master Jul 28, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Date based pruning for ESM1.5
3 participants