diff --git a/doc/source/_static/lez.png b/doc/source/_static/lez.png new file mode 100644 index 00000000..e95486e1 Binary files /dev/null and b/doc/source/_static/lez.png differ diff --git a/doc/source/user_guide/index.rst b/doc/source/user_guide/index.rst index 41cb1e77..104e4174 100755 --- a/doc/source/user_guide/index.rst +++ b/doc/source/user_guide/index.rst @@ -11,3 +11,4 @@ Quickstart quickstart/cance_first_simulation quickstart/france_large_domain_simulation + quickstart/lez_split_sample_test diff --git a/doc/source/user_guide/quickstart/france_large_domain_simulation.rst b/doc/source/user_guide/quickstart/france_large_domain_simulation.rst index 57e4288b..3a53552b 100644 --- a/doc/source/user_guide/quickstart/france_large_domain_simulation.rst +++ b/doc/source/user_guide/quickstart/france_large_domain_simulation.rst @@ -96,7 +96,7 @@ simulation period is different. Model mesh creation ******************* -For the meshing, we only need the flow direction file and the mainland France bouding box ``bbox`` to pass to the `smash.factory.generate_mesh` +For the ``mesh``, we only need the flow direction file and the mainland France bouding box ``bbox`` to pass to the `smash.factory.generate_mesh` function. A bouding box in `smash` is a list of 4 values (``xmin``, ``xmax``, ``ymin``, ``ymax``), each of which corresponds respectively to the x minimum value, the x maximum value, the y mimimum value and the y maximum value. The values must be in the same unit and projection as the flow direction. diff --git a/doc/source/user_guide/quickstart/lez_split_sample_test.rst b/doc/source/user_guide/quickstart/lez_split_sample_test.rst new file mode 100644 index 00000000..a843a302 --- /dev/null +++ b/doc/source/user_guide/quickstart/lez_split_sample_test.rst @@ -0,0 +1,348 @@ +.. _user_guide.quickstart.lez_split_sample_test: + +======================= +Lez - Split Sample Test +======================= + +This third and last quickstart tutorial on `smash` will be carried out on another French catchment, **the Lez at Lattes**. The objective is to +set up a split sample test :cite:p:`klemevs1986operational` over two distinct periods ``p1`` and ``p2``. + +.. image:: ../../_static/lez.png + :width: 400 + :align: center + +You need first to download all the required data. + +.. button-link:: https://smash.recover.inrae.fr/dataset/Lez-dataset.tar + :color: primary + :shadow: + :align: center + + **Download** + +If the download was successful, a file named ``Lez-dataset.tar`` should be available. We can switch to the directory where this file has been +downloaded and extract it using the following command: + +.. code-block:: shell + + tar xf Lez-dataset.tar + +Now a folder called ``Lez-dataset`` should be accessible and contain the following files and folders: + +- ``France_flwdir.tif`` + A GeoTiff file containing the flow direction data, +- ``gauge_attributes.csv`` + A csv file containing the gauge attributes (gauge coordinates, drained area and code), +- ``prcp`` + A directory containing precipitation data in GeoTiff format with the following directory structure: ``%Y/%m`` + (``2012/08``), +- ``pet`` + A directory containing daily interannual potential evapotranspiration data in GeoTiff format, +- ``qobs`` + A directory containing the observed discharge data in csv format, +- ``descriptor`` + A directory containing physiographic descriptors in GeoTiff format. + +We can open a Python interface in the **conda environment**. The current working directory will be assumed to be the directory where +the ``Lez-dataset`` is located. + +.. code-block:: shell + + (smash) python3 + +.. ipython:: python + :suppress: + + import os + os.system("python3 gen_dataset.py -d Lez") + +Imports +------- + +We will first import everything we need in this tutorial. + +.. ipython:: python + + import smash + import numpy as np + import pandas as pd + import matplotlib.pyplot as plt + +Model creation +-------------- + +Model setup creation +******************** + +Since we are going to work on two different periods, each of 6 months, we need to create two ``setups`` dictionaries where the only difference +will be in the simulation period arguments ``start_time`` and ``end_time``. The first period ``p1`` will run from ``2012-08-01`` to +``2013-01-31`` and the second, from ``2013-02-01`` to ``2013-07-31``. + +.. ipython:: python + + setup_p1 = { + "start_time": "2012-08-01", + "end_time": "2013-01-31", + "dt": 86_400, # daily time step + "hydrological_module": "gr4", + "routing_module": "lr", + "read_prcp": True, + "prcp_directory": "./Lez-dataset/prcp", + "read_pet": True, + "pet_directory": "./Lez-dataset/pet", + "read_qobs": True, + "qobs_directory": "./Lez-dataset/qobs" + } + + setup_p2 = { + "start_time": "2013-02-01", + "end_time": "2013-07-31", + "dt": 86_400, # daily time step + "hydrological_module": "gr4", + "routing_module": "lr", + "read_prcp": True, + "prcp_directory": "./Lez-dataset/prcp", + "read_pet": True, + "pet_directory": "./Lez-dataset/pet", + "read_qobs": True, + "qobs_directory": "./Lez-dataset/qobs" + } + +Model mesh creation +******************* + +For the ``mesh``, there is no need to generate two different ``meshes`` dictionaries, the same one can be used for both periods. Similar to the +**Cance** tutorial, we are going to create a ``mesh`` from the attributes of the catchment gauges. + +.. ipython:: python + + gauge_attributes = pd.read_csv("./Lez-dataset/gauge_attributes.csv") + + mesh = smash.factory.generate_mesh( + flwdir_path="./Lez-dataset/France_flwdir.tif", + x=list(gauge_attributes["x"]), + y=list(gauge_attributes["y"]), + area=list(gauge_attributes["area"] * 1e6), # Convert km² to m² + code=list(gauge_attributes["code"]), + ) + +And quickly see that the generated ``mesh`` is correct + +.. ipython:: python + + plt.imshow(mesh["flwdst"]); + plt.colorbar(label="Flow distance (m)"); + @savefig user_guide.quickstart.lez_split_sample_test.flwdst.png + plt.title("Lez - Flow distance"); + +.. ipython:: python + + plt.imshow(mesh["flwacc"]); + plt.colorbar(label="Flow accumulation (m²)"); + @savefig user_guide.quickstart.lez_split_sample_test.flwacc.png + plt.title("Lez - Flow accumulation"); + +.. ipython:: python + + base = np.zeros(shape=(mesh["nrow"], mesh["ncol"])) + base = np.where(mesh["active_cell"] == 0, np.nan, base) + for pos in mesh["gauge_pos"]: + base[pos[0], pos[1]] = 1 + plt.imshow(base, cmap="Set1_r"); + @savefig user_guide.quickstart.lez_split_sample_test.gauge_position.png + plt.title("Lez - Gauge position"); + +Then, we can initialize the two `smash.Model` objects + +.. ipython:: python + + model_p1 = smash.Model(setup_p1, mesh) + model_p2 = smash.Model(setup_p2, mesh) + +Model simulation +---------------- + +Optimization +************ + +First, we will optimize both models for each period to generate two sets of optimized rainfall-runoff parameters. +So far, to optimize, we have called the method associated with the `smash.Model` object `Model.optimize `. This method +will modify the associated object in place (i.e. the values of the rainfall-runoff parameters after calling this function are modified). Here, we +want to optimize the model but still keep this model to run the validation afterwards. To do this, instead of calling the method +`Model.optimize ` method, we can call the `smash.optimize` function, which is identical but takes a +`smash.Model` object as input and returns a copy of it. This method allows you to optimize a `smash.Model` object and store the results in +another object without modifying the initial one. + +Similar to the **Cance** tutorial, we will perform a simple spatially uniform optimization of the rainfall-runoff parameters by minimizing the +Nash-Sutcliffe efficiency on the most downstream gauge. + +.. To speed up documentation generation +.. ipython:: python + :suppress: + + ncpu = min(5, max(1, os.cpu_count() - 1)) + model_p1_opt = smash.optimize(model_p1, common_options={"ncpu": ncpu}) + model_p2_opt = smash.optimize(model_p2, common_options={"ncpu": ncpu}) + +.. ipython:: python + :verbatim: + + model_p1_opt = smash.optimize(model_p1) + model_p2_opt = smash.optimize(model_p2) + +We can take a look at the hydrographs and optimized rainfall-runoff parameters. + +.. ipython:: python + + code = model_p1.mesh.code[0] + + f, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 4)); + + qobs = model_p1_opt.response_data.q[0,:].copy() + qobs = np.where(qobs < 0, np.nan, qobs) # To deal with missing values + qsim = model_p1_opt.response.q[0,:] + ax1.plot(qobs); + ax1.plot(qsim); + ax1.grid(ls="--", alpha=.7); + ax1.set_xlabel("Time step"); + ax1.set_ylabel("Discharge ($m^3/s$)"); + + qobs = model_p2_opt.response_data.q[0,:].copy() + qobs = np.where(qobs < 0, np.nan, qobs) # To deal with missing values + qsim = model_p2_opt.response.q[0,:] + ax2.plot(qobs, label="Observed discharge"); + ax2.plot(qsim, label="Simulated discharge"); + ax2.grid(ls="--", alpha=.7); + ax2.set_xlabel("Time step"); + ax2.legend(); + + @savefig user_guide.quickstart.lez_split_sample_test.optimize_q.png + f.suptitle( + f"Observed and simulated discharge at gauge {code}" + " for period p1 (left) and p2 (right)\nCalibration" + ); + +.. ipython:: python + + ind = tuple(model_p1.mesh.gauge_pos[0, :]) + + opt_parameters_p1 = { + k: model_p1_opt.get_rr_parameters(k)[ind] for k in ["cp", "ct", "kexc", "llr"] + } # A dictionary comprehension + + opt_parameters_p2 = { + k: model_p2_opt.get_rr_parameters(k)[ind] for k in ["cp", "ct", "kexc", "llr"] + } # A dictionary comprehension + + opt_parameters_p1 + opt_parameters_p2 + +Temporal validation +******************* + +Rainfall-runoff parameters transfer +''''''''''''''''''''''''''''''''''' + +We can now transfer the optimized rainfall-runoff parameters from each period to their respective validation period. +We will transfer the rainfall-runoff parameters from ``model_p1_opt`` to ``model_p2`` and from ``model_p2_opt`` to ``model_p1``. +There are several ways to do this: + +- Transfer all rainfall-runoff parameters at once + All rainfall-runoff parameters are stored in the variable ``values`` of the object `Model.rr_parameters `. + We can therefore pass the whole array of rainfall-runoff parameters from one object to the other. + + .. ipython:: python + + model_p1.rr_parameters.values = model_p2_opt.rr_parameters.values.copy() + model_p2.rr_parameters.values = model_p1_opt.rr_parameters.values.copy() + + .. note:: + A deep copy is recommended to avoid that the rainfall-runoff parameters between each object become shallow copies and + so that the modification of one of the arrays leads to the modification of another. + +- Transfer each rainfall-runoff parameter one by one + It is also possible to loop on each rainfall-runoff parameter and assign new rainfall-runoff parameter by passing + by getters and setters + + .. ipython:: python + + for key in model_p1.rr_parameters.keys: + model_p1.set_rr_parameters(key, model_p2_opt.get_rr_parameters(key)) + model_p2.set_rr_parameters(key, model_p1_opt.get_rr_parameters(key)) + + .. note:: + This method allows, instead of looping on all rainfall-runoff parameters, to loop only on some. We can replace + ``model_p1.rr_parameters.keys`` by ``["cp", "ct"]`` for example + +Forward run +''''''''''' + +Once the rainfall-runoff parameters have been transferred, we can proceed with the validation forward runs by calling the +`Model.forward_run ` method. + +.. ipython:: python + + model_p1.forward_run() + model_p2.forward_run() + +and visualize hydrographs + +.. ipython:: python + + code = model_p1.mesh.code[0] + + f, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 4)); + + qobs = model_p1.response_data.q[0,:].copy() + qobs = np.where(qobs < 0, np.nan, qobs) # To deal with missing values + qsim = model_p1.response.q[0,:] + ax1.plot(qobs); + ax1.plot(qsim); + ax1.grid(ls="--", alpha=.7); + ax1.set_xlabel("Time step"); + ax1.set_ylabel("Discharge ($m^3/s$)"); + + qobs = model_p2.response_data.q[0,:].copy() + qobs = np.where(qobs < 0, np.nan, qobs) # To deal with missing values + qsim = model_p2.response.q[0,:] + ax2.plot(qobs, label="Observed discharge"); + ax2.plot(qsim, label="Simulated discharge"); + ax2.grid(ls="--", alpha=.7); + ax2.set_xlabel("Time step"); + ax2.legend(); + + @savefig user_guide.quickstart.lez_split_sample_test.forward_run_q.png + f.suptitle( + f"Observed and simulated discharge at gauge {code}" + " for period p1 (left) and p2 (right)\nValidation" + ); + +Model performances +------------------ + +Finally, we can look at calibration and validation performances using certain metrics. Using the function `smash.metrics`, +you can compute one metric of your choice (among those available) for all the gauges that make up the ``mesh``. Here, we are interested +in the ``nse`` (the calibration metric) and the ``kge`` for the downstream gauge only. We will create two `pandas.DataFrame`, one for the +calibration performances and the other for the validation performances. + +.. ipython:: python + + metrics = ["nse", "kge"] + perf_cal = pd.DataFrame(index=["p1", "p2"], columns=metrics) + perf_val = perf_cal.copy() + + for m in metrics: + perf_cal.loc["p1", m] = np.round(smash.metrics(model_p1_opt, metric=m)[0], 2) + + for m in metrics: + perf_cal.loc["p2", m] = np.round(smash.metrics(model_p2_opt, metric=m)[0], 2) + + for m in metrics: + perf_val.loc["p1", m] = np.round(smash.metrics(model_p1, metric=m)[0], 2) + + for m in metrics: + perf_val.loc["p2", m] = np.round(smash.metrics(model_p2, metric=m)[0], 2) + + perf_cal # Calibration performances + + perf_val # Validation performances \ No newline at end of file