Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature]: Consider adding vertical regridding feature to reconstruct pressure from hybrid #446

Open
tomvothecoder opened this issue Apr 6, 2023 · 13 comments
Assignees
Labels

Comments

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Apr 6, 2023

Is your feature request related to a problem?

Related to #45 and #388.

Discussed in #440

Originally posted by tomvothecoder March 27, 2023
Hi @xCDAT/core-developers or anybody in the xCDAT community. Have any of you come across a package that implements an xarray-based function similar to cdutil.vertical.reconstructPressureFromHybrid? I am working on refactoring e3sm_diags and e3sm_to_cmip to use xarray/xCDAT and there are references to this function that I need to replace. Thanks!

Docstring:

def reconstructPressureFromHybrid(ps, A, B, Po):
    """
    Reconstruct the Pressure field on sigma levels, from the surface pressure
    :param Ps: Surface pressure
    :param A: Hybrid Conversion Coefficient, such as: p=B.ps+A.Po.
    :param B: Hybrid Conversion Coefficient, such as: p=B.ps+A.Po.
    :param Po: Hybrid Conversion Coefficient, such as: p=B.ps+A.Po
    :param Ps: surface pressure
    .. note::
        A and B are 1d sigma levels.
        Po and Ps must have same units.
    :returns: Pressure field, such as P=B*Ps+A*Po.
    :Example:
        .. doctest:: vertical_reconstructPressureFromHybrid
            >>> P=reconstructPressureFromHybrid(ps,A,B,Po)
    """
	...

Examples Usages:

Describe the solution you'd like

First, check of any other xarray-based package includes this feature.

Describe alternatives you've considered

No response

Additional context

No response

@tomvothecoder tomvothecoder added the type: enhancement New enhancement request label Apr 6, 2023
@tomvothecoder
Copy link
Collaborator Author

Hey @jasonb5, can you follow the "Describe the solution you'd like" and see what we should do?
If no API exists in another xarray-based package, I'm hoping to have this available by the next release around end of May.

e3sm_to_cmip and e3sm_diags are being refactored to use xarray/xcdat and both use cdutil.vertical.reconstructPressureFromHybrid() which blocks progress for these efforts.

@jasonb5
Copy link
Collaborator

jasonb5 commented Apr 7, 2023

@tomvothecoder I'll check this out and see if I can find anything existing otherwise I'll go off the original plan I had for this in #388.

@tomvothecoder
Copy link
Collaborator Author

Thanks @jasonb5!

@dcherian
Copy link

dcherian commented Apr 15, 2023

I would happily merge this in cf-xarray. We already support a few ocean parametric coordinates. The code would go here https://github.com/xarray-contrib/cf-xarray/blob/c36a3687b3820176c394580f86a67ca53b9cadbd/cf_xarray/accessor.py#L2507

@tomvothecoder
Copy link
Collaborator Author

tomvothecoder commented Apr 17, 2023

Hey @dcherian, yes @jasonb5 came across that feature in cf-xarray! He mentioned it might be a good opportunity to make a PR to cf-xarray to support decoding other parametric vertical coordinates.

Ideally, we'd like to have this feature out relatively soon so we can start refactoring some other packages with it.
If cf-xarray has a quick turnaround time for releases, we can definitely contribute with a PR. Alternatively, we can implement this feature in xCDAT in the interim and then port it over to cf-xarray at a later date.

@dcherian
Copy link

If cf-xarray has a quick turnaround time for releases

Yes, at the moment I just release when there's some nice new feature.

@pochedls
Copy link
Collaborator

pochedls commented Aug 3, 2023

In testing xcdat v0.6.0, I wanted to be able to create the pressure coordinate from hybrid (generically across a bunch of models). I ended up creating the following function, which takes the embedded formula (e.g., ds.lev.formula) and converts it into a statement that can be executed by xarray (e.g., interprete_formula(ds.lev.formula) yields p = ds['a']*ds['p0']+ds['b']*ds['ps'] where ds.lev.formula = p = a*p0 + b*ps). So you can then do:

f = ds.plev.formula
fstring = interprete_formula(f)
exec(f)
print(p)

<xarray.DataArray (lev: 91, time: 1980, lat: 128, lon: 256)>
...
Coordinates:

  • lev (lev) float64 0.9988 0.9959 0.992 ... 5.61e-05 2.951e-05 9.869e-06
  • lat (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
  • lon (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
  • time (time) object 1850-01-16 12:00:00 ... 2014-12-16 12:00:00

This is a clunky function (though it so far appears to be working on CMIP data...). I am absolutely sure you developers can come up with a better method for interpreting the hybrid formula on the fly, but I thought I'd put it here in case it is useful. I also recognize using exec/eval is not a best practice...maybe this can be avoided somehow (I have ideas if anyone ends up working on this).

def interprete_formula(f):
    operator_chars = ['*', '(', ')', '+', '-', '/']
    # first find/remove all parenthesis that are specifying axes
    # p(n,k,j,i) = ap(k) + b(k)*ps(n,j,i) --> p = ap + b*ps
    # get formula indices corresponding to a left hand parenthesis (
    pinds = [i for i in range(len(f)) if f[i] == '(']
    # initialize list to replace sections of formula (e.g., '(i,j,k)' --> '')
    replace = []
    # loop over left hand parenthesis instances
    for p in pinds:
        pref = ''  # initialize parenthesis string
        stopsearch = False  # set stop flag to false
        # loop from left hand parentheis to end of formula
        for i in range(p, len(f)):
            # if stopsearch, continue
            if stopsearch:
                continue
            # get indexed character
            c = f[i]
            # check for close parenthesis )
            # if closed, stopsearch
            if c == ')':
                pref += c # add character to parenthesis string
                stopsearch = True
            else:
                pref += c # add character to parenthesis string
        # check for arithmetic operators inside parenthesis string
        ocs = [c for c in pref[1:-1] if c in operator_chars]
        # if no arithmetic operators, queue parenthesis string for removal
        if len(ocs) == 0:
            replace.append(pref)
    # remove all parenthesis strings with no arithmetic operators
    for r in replace:
        f = f.replace(r, '')

    # further process formula
    fout = ''  # initialize blank formula
    vid = ''  # initialize blank variable
    lhs = f.split('=')[0] # get lhs
    f = f.split('=')[1] # get rhs
    # loop over each char
    for s in f:
        # ignore blank spaces
        if s == ' ':
            continue
        # if arithmetic operator, need to add to formula
        if s in operator_chars:
            # if a variable is not empty, write it out first
            if len(vid) > 0:
                fout = fout + 'ds["' + vid + '"]'
                vid = ''  # reset variable accumulator
            # write out arithmetic operator
            fout = fout + s
        else:
            # char must be part of a variable - add to variable accumulator
            vid += s
    # finished looping over all characters
    # write out variable to formula
    fout = fout + 'ds["' + vid + '"]'
    # add lhs
    fout = lhs + ' = ' + fout
    # return formula
    return fout

@jasonb5
Copy link
Collaborator

jasonb5 commented Jul 18, 2024

@pochedls @tomvothecoder Created PR in the cf_xarray repo xarray-contrib/cf-xarray#528.

@pochedls
Copy link
Collaborator

I think I have this started correctly. @jasonb5 – would decode_vertical_coords have worked for atmosphere_hybrid_sigma_pressure_coordinate before this PR (it is working now, which I assume is because of your PR)?

I pulled your PR (in my normal xcdat environment) import cf-xarray and cf.__path__ shows that I'm pointing to your PR). I then import xcdat (which I believe retains the updated cf-xarray import...).

Syntax for testers (i.e., note to self):

fn = '/p/css03/esgf_publish/CMIP6/CMIP/NASA-GISS/GISS-E3-G/amip/r1i1p3f1/CFmon/ta/gr/v20220707/ta_CFmon_GISS-E3-G_amip_r1i1p3f1_gr_197801-201412.nc'
ds = xc.open_mfdataset(fn)
ds.cf.decode_vertical_coords(outnames={'lev': 'plev'})
ds.plev

@pochedls
Copy link
Collaborator

Summary

I made some progress on this. The main finding is that models with the formula p = ap + b*ps fail to decode (sometimes with an error and sometimes silently). I think this results because under CF conventions, vertical coordinates with the standard_name : atmosphere_hybrid_sigma_pressure_coordinate have two acceptable formulas (with different fields needed): p = ap + b*ps and p = a*p0 + b*ps.

I've broken the problem into two main chunks ("Decode Error" and "Silent Failures"):

Decode Error

The following models produce an error (Required terms p0 are absent in the dataset.): IPSL* / Can* / MPI* / EC-Earth3-AerChem. These models have standard_name : atmosphere_hybrid_sigma_pressure_coordinate with formula p = ap + b*ps.

UKESM* and HadGEM3-GC31-LL produce KeyError: 'standard_name' and have standard_name : atmosphere_hybrid_height_coordinate with formula z = a + b*orog.

Silent failures

A number of models silently fail (i.e., ds.cf.decode_vertical_coords(outnames={'lev': 'plev'}) executes, but does not produce a plev DataArray): CNRM-CM6-1, CNRM-ESM2-1, GFDL-CM4, IPSL-CM6A-MR1. These datasets have standard_name : atmosphere_hybrid_sigma_pressure_coordinate and formula p = ap + b*ps. They don't have p0 in the dataset, but for some reason don't produce the error as above.

CESM2-WACCM and CESM2 also silently fail with standard_name : alevel (and no listed formula).

IPSL-CM6A-LR also silently fails. It has vertical coordinate presnivs with standard name : 'Vertical levels' and no formula.

Dataset Listings

Datasets with no apparent problem

CNRM-CM5: /p/css03/cmip5_css01/data/cmip5/output1/CNRM-CERFACS/CNRM-CM5/amip/mon/atmos/cfMon/r1i1p1/v20111006/ta/
bcc-csm1-1-m: /p/css03/cmip5_css01/data/cmip5/output1/BCC/bcc-csm1-1-m/amip/mon/atmos/cfMon/r1i1p1/v20120910/ta/
GFDL-CM3: /p/css03/esgf_publish/cmip5/output1/NOAA-GFDL/GFDL-CM3/amip/mon/atmos/cfMon/r1i1p1/v20110601/ta/
CESM1-CAM5: /p/css03/cmip5_css01/data/cmip5/output1/NSF-DOE-NCAR/CESM1-CAM5/amip/mon/atmos/cfMon/r2i1p1/v20121129/ta/
HadGEM2-A: /p/css03/cmip5_css02/data/cmip5/output1/MOHC/HadGEM2-A/amip/mon/atmos/cfMon/r1i1p1/ta/1/
inmcm4: /p/css03/cmip5_css02/data/cmip5/output1/INM/inmcm4/amip/mon/atmos/cfMon/r1i1p1/ta/1/
MIROC5: /p/css03/cmip5_css02/data/cmip5/output1/MIROC/MIROC5/amip/mon/atmos/cfMon/r1i1p1/ta/1/
MRI-CGCM3: /p/css03/esgf_publish/cmip5/output1/MRI/MRI-CGCM3/amip/mon/atmos/cfMon/r1i1p1/v20131011/ta/
CCSM4: /p/css03/esgf_publish/cmip5/gdo2/cmip5/output1/NCAR/CCSM4/amip/mon/atmos/cfMon/r7i1p1/v20121031/ta/
INM-CM5-0: /p/css03/esgf_publish/CMIP6/CMIP/INM/INM-CM5-0/amip/r1i1p1f1/CFmon/ta/gr1/v20190610/
INM-CM4-8: /p/css03/esgf_publish/CMIP6/CMIP/INM/INM-CM4-8/amip/r1i1p1f1/CFmon/ta/gr1/v20190528/
TaiESM1: /p/css03/esgf_publish/CMIP6/CMIP/AS-RCEC/TaiESM1/amip/r1i1p1f1/CFmon/ta/gn/v20200817/
MRI-ESM2-0: /p/css03/esgf_publish/CMIP6/CMIP/MRI/MRI-ESM2-0/amip/r2i1p1f1/CFmon/ta/gn/v20190722/
MIROC-ES2L: /p/css03/esgf_publish/CMIP6/CMIP/MIROC/MIROC-ES2L/amip/r3i1p1f2/CFmon/ta/gn/v20200903/
MIROC6: /p/css03/esgf_publish/CMIP6/CMIP/MIROC/MIROC6/amip/r2i1p1f1/CFmon/ta/gn/v20190311/
NorESM2-LM: /p/css03/esgf_publish/CMIP6/CMIP/NCC/NorESM2-LM/amip/r1i1p2f1/CFmon/ta/gn/v20210319/
BCC-CSM2-MR: /p/css03/esgf_publish/CMIP6/CFMIP/BCC/BCC-CSM2-MR/amip/r1i1p1f1/CFmon/ta/gn/v20190801/

Dataset with an error on ds.cf.decode_vertical_coords(outnames={'lev': 'plev'})

IPSL-CM5B-LR: /p/css03/cmip5_css01/data/cmip5/output1/IPSL/IPSL-CM5B-LR/amip/mon/atmos/cfMon/r1i1p1/v20120526/ta/
IPSL-CM5A-MR: /p/css03/cmip5_css01/data/cmip5/output1/IPSL/IPSL-CM5A-MR/amip/mon/atmos/cfMon/r1i1p1/v20120804/ta/
IPSL-CM5A-LR: /p/css03/cmip5_css01/data/cmip5/output1/IPSL/IPSL-CM5A-LR/amip/mon/atmos/cfMon/r1i1p1/v20111119/ta/
MPI-ESM-LR: /p/css03/esgf_publish/cmip5/output1/MPI-M/MPI-ESM-LR/amip/mon/atmos/cfMon/r1i1p1/v20120215/ta/
CanAM4: /p/css03/cmip5_css02/data/cmip5/output1/CCCma/CanAM4/amip/mon/atmos/cfMon/r1i1p1/ta/1/
UKESM1-0-LL: /p/css03/esgf_publish/CMIP6/CMIP/MOHC/UKESM1-0-LL/amip/r1i1p1f4/CFmon/ta/gn/v20210817/
HadGEM3-GC31-LL: /p/css03/esgf_publish/CMIP6/CMIP/MOHC/HadGEM3-GC31-LL/amip/r5i1p1f3/CFmon/ta/gn/v20210427/
CanESM5: /p/css03/esgf_publish/CMIP6/CMIP/CCCma/CanESM5/amip/r2i1p1f1/CFmon/ta/gn/v20190429/
MPI-ESM-1-2-HAM: /p/css03/esgf_publish/CMIP6/CMIP/HAMMOZ-Consortium/MPI-ESM-1-2-HAM/amip/r2i1p1f1/CFmon/ta/gn/v20190628/
EC-Earth3-AerChem: /p/css03/esgf_publish/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3-AerChem/amip/r1i1p1f1/CFmon/ta/gr/v20200910/
MPI-ESM1-2-LR: /p/css03/esgf_publish/CMIP6/CMIP/MPI-M/MPI-ESM1-2-LR/amip/r2i1p1f1/CFmon/ta/gn/v20190815/
MPI-ESM1-2-HR: /p/css03/esgf_publish/CMIP6/CMIP/MPI-M/MPI-ESM1-2-HR/amip/r2i1p1f1/CFmon/ta/gn/v20190710/

Datasets with no error, but do not produce a plev field

CNRM-CM6-1: /p/css03/esgf_publish/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1/amip/r1i1p1f2/CFmon/ta/gr/v20181203/
CNRM-ESM2-1: /p/css03/esgf_publish/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/amip/r1i1p1f2/CFmon/ta/gr/v20181205/
CESM2-WACCM: /p/css03/esgf_publish/CMIP6/CMIP/NCAR/CESM2-WACCM/amip/r2i1p1f1/CFmon/ta/gn/v20190220/
CESM2: /p/css03/esgf_publish/CMIP6/CMIP/NCAR/CESM2/amip/r1i1p1f1/CFmon/ta/gn/v20190218/
GFDL-CM4: /p/css03/esgf_publish/CMIP6/CMIP/NOAA-GFDL/GFDL-CM4/amip/r1i1p1f1/CFmon/ta/gr1/v20180701/
IPSL-CM6A-MR1: /p/css03/esgf_publish/CMIP6/CMIP/IPSL/IPSL-CM6A-MR1/amip/r1i1p1f1/CFmon/ta/gr/v20230220/
IPSL-CM6A-LR: /p/css03/esgf_publish/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/amip/r2i1p1f1/CFmon/ta/gr/v20181109/

@dcherian
Copy link

Now that is amazing.

Given

The choice of whether a(k) or ap(k) is used depends on model formulation; the former is a dimensionless fraction, the latter a pressure value.

perhaps we can check units to decide which representation to use.

@pochedls
Copy link
Collaborator

FYI – I had focused on testing atmospheric vertical coordinates. Ideally we could get some help testing ocean coordinates (@durack1?); otherwise, this might take awhile to stress test. Although I think CMIP data I've used is mostly on depth coordinates: are there CMIP ocean variables that are typically on coordinate systems that would need to be "decoded" (ocean sigma, s-coordinate, etc.)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: In Progress
Development

No branches or pull requests

4 participants