Merge branch 'main' into fix-pip-ci

calliope-project · Sep 30, 2024 · 6cd5d2f
2 parents 17a7168 + e447d17

Showing 61 changed files with 483 additions and 483 deletions.
diff --git a/.github/workflows/commit-ci.yml b/.github/workflows/commit-ci.yml
@@ -27,7 +27,7 @@ jobs:
 
     - uses: mamba-org/setup-micromamba@v1
       with:
-        micromamba-version: latest
+        micromamba-version: '1.5.10-0'
         environment-name: ${{ github.event.repository.name }}-ubuntu-latest-312-${{ hashFiles('requirements/dev.txt') }}
         environment-file: requirements/base.txt
         create-args: >-

diff --git a/.github/workflows/pr-ci.yml b/.github/workflows/pr-ci.yml
@@ -45,7 +45,7 @@ jobs:
 
     - uses: mamba-org/setup-micromamba@v1
       with:
-        micromamba-version: latest
+        micromamba-version: '1.5.10-0'
         environment-name: ${{ github.event.repository.name }}-${{ matrix.os }}-3${{ matrix.py3version }}-${{ hashFiles('requirements/dev.txt') }}
         environment-file: requirements/base.txt
         create-args: >-
@@ -108,7 +108,7 @@ jobs:
       - uses: actions/checkout@v4
       - uses: mamba-org/setup-micromamba@v1
         with:
-          micromamba-version: latest
+          micromamba-version: '1.5.10-0'
           environment-name: pipbuild
           create-args: >-
             python=3.11

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,10 @@
 ## 0.7.0.dev5 (Unreleased)
 
+### User-facing changes
+
+|changed| `data_sources` -> `data_tables` and `data_sources.source` -> `data_tables.data`.
+This change has occurred to avoid confusion between data "sources" and model energy "sources" (#673).
+
 ## 0.7.0.dev4 (2024-09-10)
 
 ### User-facing changes
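
Migration note for the `data_sources` -> `data_tables` rename in the CHANGELOG entry above: existing model definitions have to rename two keys. The sketch below shows one way this could be done on a model definition that has already been loaded into a plain dictionary. `migrate_data_sources` is a hypothetical helper written for this note and is not part of Calliope. It assumes each table is a flat mapping and leaves file paths and dot-notation override keys untouched.

```python
from copy import deepcopy


def migrate_data_sources(model_def: dict) -> dict:
    """Return a copy of `model_def` that uses the renamed keys.

    Hypothetical helper for illustration only; not part of Calliope.
    """
    migrated = deepcopy(model_def)
    if "data_sources" not in migrated:
        # Nothing to rename.
        return migrated
    if "data_tables" in migrated:
        raise ValueError("Both `data_sources` and `data_tables` are defined.")

    # Top-level key: `data_sources` -> `data_tables`.
    tables = migrated.pop("data_sources")
    for table in tables.values():
        # Per-table key: `source` -> `data` (the value itself is unchanged).
        if "source" in table:
            table["data"] = table.pop("source")
    migrated["data_tables"] = tables
    return migrated


old = {
    "data_sources": {
        "tech_data": {
            "source": "data_sources/tech_data.csv",
            "rows": ["techs", "parameters"],
        }
    }
}
new = migrate_data_sources(old)
print(new)
```

The path value still points at `data_sources/tech_data.csv`. Renaming the directory on disk, as the docs in this diff do, is a separate step.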

diff --git a/docs/creating/data_sources.md → docs/creating/data_tables.md b/docs/creating/data_sources.md → docs/creating/data_tables.md
@@ -1,17 +1,17 @@
-# Loading tabular data (`data_sources`)
+# Loading tabular data (`data_tables`)
 
 We have chosen YAML syntax to define Calliope models as it is human-readable.
 However, when you have a large dataset, the YAML files can become large and ultimately not as readable as we would like.
 For instance, for parameters that vary in time we would have a list of 8760 values and timestamps to put in our YAML file!
 
-Therefore, alongside your YAML model definition, you can load tabular data from CSV files (or from in-memory [pandas.DataFrame][] objects) under the `data_sources` top-level key.
+Therefore, alongside your YAML model definition, you can load tabular data from CSV files (or from in-memory [pandas.DataFrame][] objects) under the `data_tables` top-level key.
 As of Calliope v0.7.0, this tabular data can be of _any_ kind.
 Prior to this, loading from file was limited to timeseries data.
 
-The full syntax from loading tabular data can be found in the associated [schema][data-source-schema].
+The full syntax for loading tabular data can be found in the associated [schema][data-table-schema].
 In brief it is:
 
-* **source**: path to file or reference name for an in-memory object.
+* **data**: path to file or reference name for an in-memory object.
 * **rows**: the dimension(s) in your table defined per row.
 * **columns**: the dimension(s) in your table defined per column.
 * **select**: values within dimensions that you want to select from your tabular data, discarding the rest.
@@ -126,9 +126,9 @@ In this section we will show some examples of loading data and provide the equiv
     YAML definition to load data:
 
     ```yaml
-    data_sources:
+    data_tables:
       pv_capacity_factor_data:
-        source: data_sources/pv_resource.csv
+        data: data_tables/pv_resource.csv
         rows: timesteps
         add_dims:
           techs: pv
@@ -181,9 +181,9 @@ In this section we will show some examples of loading data and provide the equiv
     YAML definition to load data:
 
     ```yaml
-    data_sources:
+    data_tables:
       tech_data:
-        source: data_sources/tech_data.csv
+        data: data_tables/tech_data.csv
         rows: [techs, parameters]
     ```
 
@@ -224,9 +224,9 @@ In this section we will show some examples of loading data and provide the equiv
     YAML definition to load data:
 
     ```yaml
-    data_sources:
+    data_tables:
       tech_data:
-        source: data_sources/tech_data.csv
+        data: data_tables/tech_data.csv
         rows: [techs, parameters]
         add_dims:
           costs: monetary
@@ -272,7 +272,7 @@ In this section we will show some examples of loading data and provide the equiv
     1. To limit repetition, we have defined [templates](templates.md) for our costs.
 
 !!! info "See also"
-    Our [data source loading tutorial][loading-tabular-data] has more examples of loading tabular data into your model.
+    Our [data table loading tutorial][loading-tabular-data] has more examples of loading tabular data into your model.
 
 ## Selecting dimension values and dropping dimensions
 
@@ -290,9 +290,9 @@ Data in file:
 YAML definition to load only data from nodes 1 and 2:
 
 ```yaml
-data_sources:
+data_tables:
   tech_data:
-    source: data_sources/tech_data.csv
+    data: data_tables/tech_data.csv
     rows: [techs, parameters]
     columns: nodes
     select:
@@ -312,22 +312,22 @@ You will also need to `drop` the dimension so that it doesn't appear in the fina
 YAML definition to load only data from scenario 1:
 
 ```yaml
-data_sources:
+data_tables:
   tech_data:
-    source: data_sources/tech_data.csv
+    data: data_tables/tech_data.csv
     rows: [techs, parameters]
     columns: scenarios
     select:
       scenarios: scenario1
     drop: scenarios
 ```
 
-You can then also tweak just one line of your data source YAML with an [override](scenarios.md) to point to your other scenario:
+You can then also tweak just one line of your data table YAML with an [override](scenarios.md) to point to your other scenario:
 
 ```yaml
 override:
   switch_to_scenario2:
-    data_sources.tech_data.select.scenarios: scenario2  # (1)!
+    data_tables.tech_data.select.scenarios: scenario2  # (1)!
 ```
 
 1. We use dot notation as a shorthand for [abbreviated nested dictionaries](yaml.md#abbreviated-nesting).
@@ -348,9 +348,9 @@ For example, to define costs for the parameter `cost_flow_cap`:
     | tech3 | monetary | cost_flow_cap | 20    | 45    | 50    |
 
     ```yaml
-    data_sources:
+    data_tables:
       tech_data:
-        source: data_sources/tech_data.csv
+        data: data_tables/tech_data.csv
         rows: [techs, costs, parameters]
         columns: nodes
     ```
@@ -364,9 +364,9 @@ For example, to define costs for the parameter `cost_flow_cap`:
     | tech3 | 20    | 45    | 50    |
 
     ```yaml
-    data_sources:
+    data_tables:
       tech_data:
-        source: data_sources/tech_data.csv
+        data: data_tables/tech_data.csv
         rows: techs
         columns: nodes
         add_dims:
@@ -384,9 +384,9 @@ Or to define the same timeseries source data for two technologies at different n
     | 2005-01-01 01:00 | 200                              | 200                              |
 
     ```yaml
-    data_sources:
+    data_tables:
       tech_data:
-        source: data_sources/tech_data.csv
+        data: data_tables/tech_data.csv
         rows: timesteps
         columns: [nodes, techs, parameters]
     ```
@@ -401,16 +401,16 @@ Or to define the same timeseries source data for two technologies at different n
     | 2005-01-01 01:00 | 200 |
 
     ```yaml
-    data_sources:
+    data_tables:
       tech_data_1:
-        source: data_sources/tech_data.csv
+        data: data_tables/tech_data.csv
         rows: timesteps
         add_dims:
           techs: tech1
           nodes: node1
           parameters: source_use_max
       tech_data_2:
-        source: data_sources/tech_data.csv
+        data: data_tables/tech_data.csv
         rows: timesteps
         add_dims:
           techs: tech2
@@ -420,10 +420,10 @@ Or to define the same timeseries source data for two technologies at different n
 
 ## Loading CSV files vs `pandas` dataframes
 
-To load from CSV, set the filepath in `source` to point to your file.
+To load from CSV, set the filepath in `data` to point to your file.
 This filepath can either be relative to your `model.yaml` file (as in the above examples) or an absolute path.
 
-To load from a [pandas.DataFrame][], you can specify the `data_source_dfs` dictionary of objects when you initialise your model:
+To load from a [pandas.DataFrame][], you can specify the `data_table_dfs` dictionary of objects when you initialise your model:
 
 ```python
 import calliope
@@ -433,19 +433,19 @@ df2 = pd.DataFrame(...)
 
 model = calliope.Model(
     "path/to/model.yaml",
-    data_source_dfs={"data_source_1": df1, "data_source_2": df2}
+    data_table_dfs={"data_source_1": df1, "data_source_2": df2}
 )
 ```
 
-And then you point to those dictionary keys in the `source` for your data source:
+And then you point to those dictionary keys in the `data` for your data table:
 
 ```yaml
-data_sources:
+data_tables:
   ds1:
-    source: data_source_1
+    data: data_source_1
     ...
   ds2:
-    source: data_source_2
+    data: data_source_2
     ...
 ```
 
@@ -454,7 +454,7 @@ data_sources:
     Rows correspond to your dataframe index levels and columns to your dataframe column levels.
 
     You _cannot_ specify [pandas.Series][] objects.
-    Ensure you convert them to dataframes (`to_frame()`) before adding them to your data source dictionary.
+    Ensure you convert them to dataframes (`to_frame()`) before adding them to your data table dictionary.
 
 ## Important considerations
 
@@ -468,8 +468,8 @@ This could be defined in `rows`, `columns`, or `add_dims`.
     3. `add_dims` to add dimensions.
 This means you can technically select value "A" from dimensions `nodes`, then drop `nodes`, then add `nodes` back in with the value "B".
 This effectively replaces "A" with "B" on that dimension.
-3. The order of tabular data loading is in the order you list the sources.
-If a new table has data which clashes with preceding data sources, it will override that data.
+3. The order of tabular data loading is in the order you list the tables.
+If a new table has data which clashes with preceding tables, it will override that data.
 This may have unexpected results if the files have different dimensions as the dimensions will be broadcast to match each other.
 4. CSV files must have `.csv` in their filename (even if compressed, e.g., `.csv.zip`).
 If they don't, they won't be picked up by Calliope.
@@ -481,7 +481,7 @@ E.g.,
     nodes:
       node1.techs: {tech1, tech2, tech3}
       node2.techs: {tech1, tech2}
-    data_sources:
+    data_tables:
       ...
     ```
 6. We process dimension data after loading it in according to a limited set of heuristics:
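
Note on the "Important considerations" hunk above: the processing order (`select`, then `drop`, then `add_dims`) and the rule that later tables override earlier ones can be easier to follow with a toy model. The sketch below represents a table as a list of plain dictionaries. `apply_table_ops` and `index_by_dims` are hypothetical stand-ins written for this note, not Calliope's implementation, and the sketch ignores broadcasting between tables with different dimensions.

```python
def apply_table_ops(records, select=None, drop=None, add_dims=None):
    """Apply select, then drop, then add_dims to a list of dict records.

    Each record maps dimension names to values, plus a "value" entry.
    Hypothetical stand-in for illustration; not Calliope's implementation.
    """
    select = select or {}
    drop = [drop] if isinstance(drop, str) else (drop or [])
    add_dims = add_dims or {}

    out = []
    for rec in records:
        # 1. `select`: keep only records whose dimension values were requested.
        keep = True
        for dim, wanted in select.items():
            wanted = [wanted] if isinstance(wanted, str) else wanted
            if rec[dim] not in wanted:
                keep = False
        if not keep:
            continue
        # 2. `drop`: remove the listed dimensions from surviving records.
        new = {k: v for k, v in rec.items() if k not in drop}
        # 3. `add_dims`: attach new dimensions with a fixed value.
        new.update(add_dims)
        out.append(new)
    return out


def index_by_dims(records):
    """Key each value by its sorted (dimension, value) pairs."""
    return {
        tuple(sorted((k, v) for k, v in rec.items() if k != "value")): rec["value"]
        for rec in records
    }


table_1 = [
    {"nodes": "A", "techs": "tech1", "value": 10},
    {"nodes": "C", "techs": "tech1", "value": 99},
]
# Select "A" on `nodes`, drop `nodes`, then add `nodes` back in as "B".
relabelled = apply_table_ops(
    table_1, select={"nodes": "A"}, drop="nodes", add_dims={"nodes": "B"}
)

table_2 = [{"nodes": "B", "techs": "tech1", "value": 25}]

# Tables load in the order listed; a later table overrides clashing entries.
merged = index_by_dims(relabelled)
merged.update(index_by_dims(table_2))
print(relabelled)
print(merged)
```

Here the value selected at node "A" ends up labelled as node "B", and the second table then replaces it because both tables define the same `(nodes, techs)` combination.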

diff --git a/docs/creating/index.md b/docs/creating/index.md
@@ -35,7 +35,7 @@ We distinguish between:
 - the model **definition** (your representation of a physical system in YAML).
 
 Model configuration is everything under the top-level YAML key [`config`](config.md).
-Model definition is everything else, under the top-level YAML keys [`parameters`](parameters.md), [`techs`](techs.md), [`nodes`](nodes.md), [`templates`](templates.md), and [`data_sources`](data_sources.md).
+Model definition is everything else, under the top-level YAML keys [`parameters`](parameters.md), [`techs`](techs.md), [`nodes`](nodes.md), [`templates`](templates.md), and [`data_tables`](data_tables.md).
 
 It is possible to define alternatives to the model configuration/definition that you can refer to when you initialise your model.
 These are defined under the top-level YAML keys [`scenarios` and `overrides`](scenarios.md).
@@ -52,7 +52,7 @@ The layout of that directory typically looks roughly like this (`+` denotes dire
     + model_definition
         - nodes.yaml
         - techs.yaml
-    + data_sources
+    + data_tables
         - solar_resource.csv
         - electricity_demand.csv
     - model.yaml
@@ -63,7 +63,7 @@ In the above example, the files `model.yaml`, `nodes.yaml` and `techs.yaml` toge
 This definition could be in one file, but it is more readable when split into multiple.
 We use the above layout in the example models.
 
-Inside the `data_sources` directory, tabular data are stored as CSV files.
+Inside the `data_tables` directory, tabular data are stored as CSV files.
 
 !!! note
     The easiest way to create a new model is to use the `calliope new` command, which makes a copy of one of the built-in example models:
@@ -85,4 +85,4 @@ The rest of this section discusses everything you need to know to set up a model
 - More details on the [model configuration](config.md).
 - The key parts of the model definition, first, the [technologies](techs.md), then, the [nodes](nodes.md), the locations in space where technologies can be placed.
 - How to use [technology and node templates](templates.md) to reduce repetition in the model definition.
-- Other important features to be aware of when defining your model: defining [indexed parameters](parameters.md), i.e. parameter which are not indexed over technologies and nodes, [loading tabular data](data_sources.md), and defining [scenarios and overrides](scenarios.md).
+- Other important features to be aware of when defining your model: defining [indexed parameters](parameters.md), i.e. parameters that are not indexed over technologies and nodes, [loading tabular data](data_tables.md), and defining [scenarios and overrides](scenarios.md).