MAINT: Switch from gdal to rasterio #167

Merged
merged 2 commits into from
Apr 21, 2024

Conversation

inoelloc
Member

@inoelloc inoelloc commented Apr 19, 2024

This commit addresses the problem of distributing smash with gdal. gdal doesn't provide binary wheels, so the gdal library must be installed on the system before the gdal Python bindings can be installed. conda is one simple way to work with gdal in Python, but to make smash easier to distribute across operating systems (Linux, macOS and Windows), it is simpler to read rasters with rasterio, which ships binary wheels that include gdal. This change is not neutral: it increases the time spent reading rasters (rain, etp, descriptors, etc.). Generally speaking, reading the input data is relatively quick compared with a calibration. However, if reading time becomes too great a concern, it can be parallelized.
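As a rough illustration of the parallelization mentioned above (not the code in this PR), the sketch below reads one band per file with rasterio and fans the reads out over a thread pool. `read_band` and `read_many` are hypothetical helper names, not smash functions; `rasterio.open` and `dataset.read` are the only rasterio calls used. Each call opens its own dataset, so no dataset handle is shared between threads. Whether threads actually speed things up depends on the storage and file format, so this would need measuring.

```python
from concurrent.futures import ThreadPoolExecutor


def read_band(path, band=1):
    """Read one band of a raster file as a NumPy array (hypothetical helper)."""
    import rasterio  # imported lazily so read_many can be used with any reader

    with rasterio.open(path) as dataset:
        return dataset.read(band)


def read_many(paths, reader=read_band, max_workers=4):
    """Apply `reader` to every path concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as `paths`.
        return list(pool.map(reader, paths))
```

Calling `read_many(list_of_tif_paths)` would return the arrays in the same order as the input paths; the `reader` argument is only there so another reading function can be swapped in.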

Two other minor changes have been made:

  • The date regular expression now matches only against the file name, not the entire path. Matching against the entire path can cause problems, for example, if the path is something like /home/2020930920/data/prcp/rain_202001010000.tif
  • Tests have been updated to copy arrays instead of reading the rasters N times
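The file-name-only matching can be sketched as follows. This is a minimal illustration, not the pattern smash uses: `DATE_PATTERN` (8 to 12 consecutive digits) and `find_date_stamp` are assumptions made for the example.

```python
import os
import re

# Hypothetical pattern for illustration: a run of 8 to 12 digits.
DATE_PATTERN = re.compile(r"\d{8,12}")


def find_date_stamp(path):
    """Return the date stamp found in the file name, or None if there is none."""
    name = os.path.basename(path)  # drop the directories before matching
    match = DATE_PATTERN.search(name)
    return match.group() if match else None


path = "/home/2020930920/data/prcp/rain_202001010000.tif"

# Searching the whole path picks up the digits of the directory name first.
print(DATE_PATTERN.search(path).group())  # 2020930920
# Searching only the file name finds the intended stamp.
print(find_date_stamp(path))  # 202001010000
```

With this hypothetical pattern, the directory name `2020930920` is matched before the file name when the whole path is searched, which is the kind of false match the change avoids.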

NOTE: A future commit will integrate meson to manage the smash build system. This will bring a number of changes to the way smash is installed.

@inoelloc inoelloc added the maintenance label Apr 19, 2024
Review comment on doc/source/release/1.0.0-notes.rst (outdated, resolved)
@inoelloc inoelloc merged commit c325f7d into maintenance/1.0.x Apr 21, 2024
18 checks passed
@inoelloc inoelloc deleted the maint-switch-from-gdal-to-rasterio branch April 21, 2024 14:30
Labels
maintenance Maintenance
2 participants