Round data in staticmaps.nc #231
Comments
Hi @hboisgon, I saw an issue like this come up in my mentions earlier, for Wflow.jl: Deltares/Wflow.jl#314

Like I mention there: you generally don't want to round binary numbers. A float32 will always take 32 bits of memory, and a float64 will always take 64 bits. Rounding only helps once compression is turned on: it reduces the information content, so the compression algorithm can find more redundancy, but the gain is modest. So if you're looking to reduce file sizes, I recommend investigating compression instead. NetCDF4 has traditionally only supported zlib compression; Zarr, for example, uses Blosc for far more performant compression.

With regards to the physical interpretation: if you want to add that, you should probably try adding metadata instead. You could argue that a river width is never more accurate than 1 cm (for example), but that doesn't generalize, e.g. if you're doing computational/numerical experiments. And in that case you should do proper error propagation, ideally supported in an xarray package, like pint is: https://xarray.dev/blog/introducing-pint-xarray
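To make the point about rounding and compression concrete, here is a minimal sketch using only the Python standard library. The data is synthetic (a smooth signal with a little noise, standing in for a grid; it is not taken from staticmaps.nc), and the `zlib` module stands in for the deflate compression a netCDF writer would apply. The helper `pack_float32` is a hypothetical name used for illustration.

```python
import math
import random
import struct
import zlib


def pack_float32(values):
    """Serialize a list of Python floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


# Synthetic, spatially smooth signal with small noise (an assumption,
# standing in for a grid such as elevation; not real staticmaps.nc data).
random.seed(0)
n = 50_000
raw = [100.0 + 10.0 * math.sin(i / 500.0) + random.uniform(-1e-3, 1e-3)
       for i in range(n)]
rounded = [round(v, 1) for v in raw]

raw_bytes = pack_float32(raw)
rounded_bytes = pack_float32(rounded)

# Uncompressed size is identical: each float32 takes 4 bytes either way.
print("uncompressed:", len(raw_bytes), len(rounded_bytes))

# Only after compression does rounding make a difference.
raw_compressed = len(zlib.compress(raw_bytes, 6))
rounded_compressed = len(zlib.compress(rounded_bytes, 6))
print("zlib-compressed:", raw_compressed, rounded_compressed)
```

How much rounding helps depends heavily on the data: smooth fields like this one benefit a lot, while noisy fields may gain far less.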
This may also relate to https://docs.xarray.dev/en/latest/user-guide/io.html#writing-encoded-data. I read online (pydata/xarray#865 and pydata/xarray#1572) that lossy compression is possible and may go hand in hand with rounding the data, as accuracy is guaranteed for a certain number of digits, I guess similar to this: https://docs.unidata.ucar.edu/netcdf-c/current/md__media_psf_Home_Desktop_netcdf_releases_v4_9_2_release_netcdf_c_docs_quantize.html

I am not sure how this would work with zlib: is zlib itself lossy or lossless, and can the two approaches be combined?
It looks a bit like a breadcrumb trail to be honest, as xarray doesn't provide an overview itself, which is reasonable, since it depends on what's available in the netCDF4 / HDF5 binaries. The relevant netCDF4-python docs: https://unidata.github.io/netcdf4-python/#efficient-compression-of-netcdf-variables
For hydromt, you can safely assume that the binaries come from conda-forge, so whatever plugins are compiled there are the relevant ones. More info is probably only available in the netCDF docs directly, among them the page on quantizing: https://docs.unidata.ucar.edu/netcdf-c/current/md__media_psf_Home_Desktop_netcdf_releases_v4_9_2_release_netcdf_c_docs_quantize.html

Pretty confident that zlib is lossless. Best approach IMO is to set up a new pixi env, see which schemes work, and make some examples. Would be useful documentation anyway!
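As a rough illustration of how a lossy step and a lossless step can be combined, the sketch below first discards trailing mantissa bits of each float32 (lossy) and then compresses the result with zlib (lossless). This is plain bit truncation, a cruder relative of the quantization described in the netCDF docs linked above, and not its actual algorithm. The function `shave_mantissa` and the choice of `keep_bits = 10` are hypothetical, and the data is random rather than real grid data.

```python
import random
import struct
import zlib


def shave_mantissa(value, keep_bits):
    """Convert `value` to float32 and zero all but the first `keep_bits`
    of its 23 explicit mantissa bits (plain truncation, no rounding)."""
    drop = 23 - keep_bits
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    mask = (0xFFFFFFFF << drop) & 0xFFFFFFFF
    (shaved,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return shaved


random.seed(0)
values = [random.uniform(1.0, 100.0) for _ in range(20_000)]
keep_bits = 10  # hypothetical precision choice, roughly 3 significant digits

quantized = [shave_mantissa(v, keep_bits) for v in values]

original_bytes = struct.pack(f"<{len(values)}f", *values)
quantized_bytes = struct.pack(f"<{len(quantized)}f", *quantized)

# Lossy step: the values change, but only within a bounded relative error.
max_rel_error = max(abs(q - v) / abs(v) for q, v in zip(quantized, values))

# Lossless step: zlib reproduces the quantized bytes exactly.
compressed = zlib.compress(quantized_bytes, 6)
roundtrip_ok = zlib.decompress(compressed) == quantized_bytes

original_compressed = len(zlib.compress(original_bytes, 6))
print("max relative error:", max_rel_error)
print("lossless round trip:", roundtrip_ok)
print("compressed sizes:", original_compressed, len(compressed))
```

The information loss happens entirely in the truncation step; zlib only exploits the redundancy that truncation creates. Which of these schemes are actually available through xarray and netCDF4 in a given environment would still need to be checked, as suggested above.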
Kind of request
Changing existing functionality
Enhancement Description
We have the request for staticgeoms but I think it would be good practice to round all grids in staticmaps.nc.
The number of decimals does not make sense. Eg
Use case
It would produce a smaller staticmaps.nc file. Not sure if it would have any impact on computation speed?
Additional Context
No response