Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dimensionless Units #55

Draft
wants to merge 23 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
668c052
root: start commit of dimensionless units
pgierz Nov 8, 2024
c05ba93
wip: adds example dimensionless_mappings
pgierz Nov 8, 2024
b1ef7f2
fix(dimensionless_mappings): Inner key should be dict, not list
pgierz Nov 8, 2024
7900103
feat: allows rules to have access to dimensionless unit mapping table
pgierz Nov 8, 2024
9da4cf8
fix: removes custom cache function from pipeline, as it is still wip
pgierz Nov 8, 2024
367a1fc
fix: Update data/dimensionless_mappings.yaml
pgierz Nov 8, 2024
7d9d910
fixes the issue #54
siligam Nov 8, 2024
0d7f331
Update data/dimensionless_mappings.yaml
siligam Nov 9, 2024
9b0b05e
Update src/pymorize/units.py
siligam Nov 9, 2024
1c5b69b
doc(cmorizer.py): adds documentation for init method
pgierz Nov 11, 2024
e88b52a
added frequency check to validate method
siligam Nov 12, 2024
cc4f159
fix(cmorizer): better constructor for parallelization
pgierz Nov 11, 2024
78f82a6
fix(cmorizer): validate should not be called during object creation
pgierz Nov 12, 2024
c6c8a7f
test: add unit test for unitless PSU conversion
pgierz Nov 12, 2024
6fbc69f
Merge branch 'main' into feat/dimensionless-units
pgierz Nov 12, 2024
aa51c32
added units check to validate method in CMORizer
siligam Nov 12, 2024
aaa1f19
redefined psu
siligam Nov 12, 2024
0ba5fac
added missing import statement
siligam Nov 12, 2024
6bb99ff
re-write unit checker
siligam Nov 12, 2024
d066700
fixed typo
siligam Nov 12, 2024
64d94ae
updated unit_mapping and tests for unit psu
siligam Nov 12, 2024
4eb1478
reverted the dimsionless unit test case
siligam Nov 13, 2024
36fcefd
fix in cmorizer, rework on dimensionless mapping tests, and setting o…
mandresm Nov 15, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 27 additions & 2 deletions src/pymorize/cmorizer.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,9 +12,14 @@
from prefect_dask import DaskTaskRunner
from rich.progress import track

from .data_request import (DataRequest, DataRequestTable, DataRequestVariable,
IgnoreTableFiles)
from .data_request import (
DataRequest,
DataRequestTable,
DataRequestVariable,
IgnoreTableFiles,
)
from .filecache import fc
from .units import handle_unit_conversion
from .logging import logger
from .pipeline import Pipeline
from .rule import Rule
Expand Down Expand Up @@ -256,6 +261,7 @@ def validate(self):
# self._check_rules_for_table()
# self._check_rules_for_output_dir()
self._check_is_subperiod()
self._check_units()

def _check_is_subperiod(self):
logger.info("checking frequency in netcdf file and in table...")
Expand Down Expand Up @@ -288,6 +294,25 @@ def _check_is_subperiod(self):
if errors:
for err in errors:
logger.error(err)
raise errors[0]

def _check_units(self):
errors = []
for rule in self.rules:
for input_collection in rule.inputs:
try:
filename = input_collection.files[0]
except IndexError:
break
da = xr.open_dataarray(filename, use_cftime=True)
mandresm marked this conversation as resolved.
Show resolved Hide resolved
try:
handle_unit_conversion(da, rule)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a check. You just call the actual conversion function. That's supposed to happen in process. I think you instead should see if the variable has supplied dimensionless units, and if those are defined in a way that pint will be able to handle it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay. will rewrite it.

Copy link
Contributor

@siligam siligam Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: how to know the unit in the netcdf file without accessing it. I can check the cmor_variable (table unit) alone from the rule object if it dimensionless and if the unit mapping is found in dimensionless_units but how about the source units. If model_variable is defined then I can check it as well but that means the user wants to use model_variable instead of the units found in the netcdf file.

consider the following table

var.name modelunit tableunit converted_representation
sidmasstranx kg s-1 m-1 kg s-1
sisnconc 1 % 100.0 percent
sitimefrac 1.0 1 1.0 dimensionless
so psu 0.001 0.001 gram / gram

If I just check the tableunit, then I can not catch sidmasstranx failing and for sitimefrac the check will fail as it is dimensionless and there is no entry in the mapping for it yet but if the user puts an mapping entry for this variable then it still fails as source units (from netcdf file or user defined modelunits) are still dimensionless and the unit conversion function does not consider using mapping for source units.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The re-written code only deals with dimensionless variables (i.e., ignoring sidmasstranx case) and addresses the rest of the cases.

except ValueError as e:
msg = f"Error while converting units for {filename}: {e}"
logger.error(msg)
errors.append(e)
if errors:
raise errors[0]

@classmethod
def from_dict(cls, data):
Expand Down
4 changes: 2 additions & 2 deletions tests/unit/test_dimensionless_units.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,6 @@ def test_units_with_psu():
new_da = handle_unit_conversion(da, rule)
assert new_da.attrs.get("units") == "0.001"
# Check the magnitude of the data
# 1 g/kg is approximately 1 psu, so the values should be 10
# 1 g/kg is approximately 1 psu, so the values should be 10 * 1000
# after conversion:
assert np.allclose(new_da.values, 10)
assert np.allclose(new_da.values, 10000)
Loading