[Schema] Exposure costs and metrics #194

duncandewhurst · 2023-08-16T22:32:24Z

From GFDRR/rdls-spreadsheet-template#3 (comment):

Why is the exposure_cost sheet populated? If I understood correctly, the dataset doesn't describe the cost of buildings, it only describes their area.

At the moment, the "cost" of exposure is limited to monetary currencies. But the value represented by an exposure dataset might be intangible, or just a proxy to later calculate the economic value; in my experience it is actually pretty uncommon to use an exposure dataset that already comes into economic terms. In this specific case it is a value of built-up area over total pixel area. In other cases, the value could be building height, or volume, population density or others. A range of different metrics could be represented by exposure, in order to measure the cost.

I see two options:
1. Make the `cost` field optional and use it only when the value is actually a currency amount. Don't specify the exposure metric.
2. Add an exposure `metric` field that uses an open codelist.

@matamadio do analysts need to know the exposure metric at the point of selecting a dataset?


matamadio · 2023-08-17T09:46:57Z

@matamadio do analysts need to know the exposure metric at the point of selecting a dataset?

To me it is one of the most important pieces of information to provide for exposure, similar to the hazard imt. It doesn't need to be within the "cost" array; it can be at the top level as exposure/metric.

odscjen · 2023-08-21T10:12:01Z

This sounds as though we need a new object in addition to exposure.cost, e.g.

"metrics": {
  "title": "Asset metrics",
  "type": "array",
  "description": "The non-monetary exposure metrics associated with specific elements of assets detailed in the dataset. If a metric is measured exclusively in monetary values use `cost`.",
  "items": {
    "$ref": "#/$defs/Metric"
  },
  "minItems": 1,
  "uniqueItems": true
}

where Metric is

"Metric": {
      "title": "Asset metric",
      "type": "object",
      "description": "The metric associated with specific elements of assets detailed in the dataset.",
      "required": [
        "id",
        "type",
        "unit"
      ],
      "properties": {
        "id": {
          "title": "Identifier",
          "type": "string",
          "description": "A locally unique identifier for this metric.",
          "minLength": 1
        },
        "type": {
          "title": "Metric type",
          "description": "The type of the metric, from the closed [cost type codelist](https://rdl-standard.readthedocs.io/en/{{version}}/reference/codelists/#cost_type).",
          "type": "string",
          "codelist": "cost_type.csv",
          "openCodelist": false,
          "enum": [
            "structure",
            "content",
            "product",
            "disruption"
          ]
        },
        "unit": {
          "title": "Metric unit",
          "type": "string",
          "description": "The unit in which the metric is specified, from the open [impact_unit codelist](https://rdl-standard.readthedocs.io/en/{{version}}/reference/codelists/#impact_unit).",
          "codelist": "impact_unit.csv",
          "openCodelist": true
        }
      },
      "minProperties": 1
    }

Is this object likely to be potentially needed anywhere else? If not it doesn't need to be in $defs and can just go straight into exposure.

I think though if we go with this we'll need to revise some of the codelist names, rename 'cost_type.csv' to 'asset_type.csv' and rename 'impact_unit.csv' to 'metric_unit.csv'. My logic for the second of these is that impact is a specific type of metric but happy to have alternative names for this one suggested. Or alternatively @matamadio @stufraser1 is 'impact_unit.csv' not appropriate for this exposure metric and do we need an entirely new codelist for this field?

matamadio · 2023-08-21T15:23:50Z

Thanks for the proposal; I made a counterproposal splitting metric into 2 arrays:

Exposure

- category
- taxonomy
- metric
  - monetary (cost)
    - type (as is)
    - unit (as is) - separate from vulnerability/cost
  - non-monetary
    - type (new codelist)
    - unit (new codelist)

If this makes sense:

- rename cost as monetary (also codelist monetary_type.csv and monetary_unit.csv)
  - full list of currencies as unit is ok, but realistically we would need something like USD (year), PPP (year), and similar comparable units
- add new array non-monetary and associated type and unit open codelists (nonmonetary_type.csv and nonmonetary_unit.csv)
  - nonmonetary_type.csv same as monetary_type.csv with the inclusion of "population"
  - nonmonetary_unit.csv as open codelist, existing values:
    - Area (extent)
    - Count
    - Density
    - Time (period)
    - ...more to add
  - when cost type = disruption, the user might need to quantify it in terms of production time rather than monetary value

duncandewhurst · 2023-08-22T00:46:46Z

Thanks, both. I'll have a think about modelling options.

duncandewhurst · 2023-08-22T03:12:03Z

full list of currencies as unit is ok, but realistically we would need something like USD (year), PPP (year), and similar comparable units

Does PPP stand for purchasing power parities in this context? If so, PPPs seem more like conversion rates than units. Can you share a link to a dataset in which the exposure metric is expressed in purchasing power parities?
Are you suggesting that we add a field for the value date of the monetary amounts in a dataset?

nonmonetary_unit.csv as open codelist, existing values:

- Area (extent)
- Count
- Density
- Time (period)
- ...more to add

As discussed in #75 (comment), these are quantity kinds rather than units. Units would be things like square metres (for area quantities) or hours (for time quantities). I agree that it is more useful to model quantity kinds than specific units, since it should be possible to convert between units within a quantity kind (e.g. hours to minutes), but not between units of different quantity kinds (e.g. square metres to hours). I would name this field accordingly (quantityKind) and base it on a subset of the QUDT quantity kinds vocabulary, which already has codes for Area, Count, Density, Time and Currency.
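To make the distinction concrete, here is a minimal Python sketch. The unit names, the conversion factors table and the `convert` helper are all hypothetical and are not part of RDLS or QUDT; the only point is that conversion is defined within a quantity kind and refused across kinds.

```python
# Hypothetical lookup: unit -> (quantity kind, factor to that kind's base unit).
# The base units chosen here (square metre, minute) are an assumption.
UNITS = {
    "square_metre": ("area", 1.0),
    "hectare": ("area", 10_000.0),
    "minute": ("time", 1.0),
    "hour": ("time", 60.0),
}


def convert(value, from_unit, to_unit):
    """Convert a value between two units that share a quantity kind."""
    from_kind, from_factor = UNITS[from_unit]
    to_kind, to_factor = UNITS[to_unit]
    if from_kind != to_kind:
        # Square metres to hours has no meaning, so refuse it.
        raise ValueError(f"cannot convert {from_kind} to {to_kind}")
    return value * from_factor / to_factor


two_hours_in_minutes = convert(2, "hour", "minute")
three_hectares_in_m2 = convert(3, "hectare", "square_metre")
```

Recording only the quantity kind in metadata keeps the codelist short, while the specific unit can stay with the data itself.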

Can you share a link to a dataset in which the exposure metric is expressed as a quantity of density? I'm assuming you don't mean the QUDT definition of density, which is mass per unit volume so it would be good to work out what the correct quantity kind is.

stufraser1 · 2023-08-22T12:56:50Z

At the moment, the "cost" of exposure is limited to monetary currencies. But the value represented by an exposure dataset might be intangible, or just a proxy to later calculate the economic value

Agree - number of buildings / number of people / km of roads (e.g. per grid cell) are commonly used as well as total value (replacement cost / insured value) per grid cell or per building.

in my experience it is actually pretty uncommon to use an exposure dataset that already comes into economic terms.

It is common in national level datasets and some global datasets, but maybe not in the ones used in examples so far. See Central Asia datasets, Africa R5, GEM's global exposure model, as just a few examples. It is also the case as stated that the value might be area or length or count.

To me it is one of the most important pieces of information to provide for exposure

Agree -- cost type or (monetary/non-monetary) value of the exposure needs to be readily visible in metadata.

Can you share a link to a dataset in which the exposure metric is expressed as a quantity of density? I'm assuming you don't mean the QUDT definition of density, which is mass per unit volume

This refers more to population density: the number of people in a given geographic area. 'Count' would cover this (number of buildings or population), since it would be given in the data as a count per raster grid cell. I haven't yet seen an exposure dataset with the value given as 'no. buildings per km2'.

I think the suggestion from @matamadio works to make it clearer that we can include monetary and non-monetary values and the latter should include Area and Count. I don't think we need Time/Duration here as a metric. In my experience exposure isn't ever given a time value. We might estimate the disruption time as a loss, or (for insurance datasets only) identify an insured value for business interruption for a building, but we wouldn't record a unit of time in the exposure dataset - I can't think of an example where a road or building would be attributed a time value - it wouldn't mean anything practically.

I would request that the data structure allows one or more of count, area AND cost to be included in the same dataset - I can point to examples where the cost is derived from one or both of area and count, and all pieces of data are included in the final data.
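As a hedged illustration of why all three values can end up in one dataset, the sketch below derives a cost from a count and an area for a single grid cell. The field names, the `add_derived_values` helper and every number are invented for illustration and are not taken from any of the datasets mentioned.

```python
# One hypothetical grid cell; all values are made up.
cell = {
    "building_count": 12,         # count metric
    "mean_floor_area_m2": 150.0,  # area metric, per building
}

# Assumed replacement cost per square metre, purely illustrative.
UNIT_COST_USD_PER_M2 = 400.0


def add_derived_values(cell, unit_cost):
    """Return a copy of the cell with total area and cost derived from it."""
    total_area = cell["building_count"] * cell["mean_floor_area_m2"]
    return {
        **cell,
        "total_floor_area_m2": total_area,
        "replacement_cost_usd": total_area * unit_cost,
    }


enriched = add_derived_values(cell, UNIT_COST_USD_PER_M2)
```

The resulting record carries a count, an area and a cost side by side, so the metadata would need to be able to declare all three for the same dataset.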

matamadio · 2023-08-22T15:48:16Z

Does PPP stand for purchasing power parities in this context? If so, PPPs seem more like conversion rates than units. Can you share a link to a dataset in which the exposure metric is expressed in purchasing power parities?

Yes, sometimes costs are expressed as PPP of local currency into USD. Anyway, not strictly necessary.

I would request that the data structure allows one or more of count, area AND cost to be included in the same dataset - I can point to examples where the cost is derived from one or both of area and count, and all pieces of data are included in the final data.

Agree on this solution.

odscjen · 2023-08-22T15:50:05Z

Great so combining all of this we could remove exposure.cost and replace it with exposure.metrics which would be an object holding 2 arrays, one of monetary metrics and one of non-monetary metrics. This would allow for multiple metrics to be included for a single dataset. We could keep using Cost as the monetary items (which is good as we still use Cost in Loss as well) and add an additional $defs/Metric for the non-monetary metric items.

{
  "metrics": {
    "title": "Asset metrics",
    "type": "object",
    "description": "The metrics associated with specific elements of assets detailed in the dataset.",
    "properties": {
      "monetary": {
        "title": "Monetary asset metrics",
        "type": "array",
        "description": "The monetary exposure metrics associated with specific elements of assets detailed in the dataset.",
        "items": {
          "$ref": "#/$defs/Cost"
        },
        "minItems": 1,
        "uniqueItems": true
      },
      "non_monetary": {
        "title": "Non-monetary asset metrics",
        "type": "array",
        "description": "The non-monetary exposure metrics associated with specific elements of assets detailed in the dataset.",
        "items": {
          "$ref": "#/$defs/Metric"
        },
        "minItems": 1,
        "uniqueItems": true
      }
    },
    "minProperties": 1
  }
}

{
  "$defs":{
    "Metric": {
      "title": "Asset metric",
      "type": "object",
      "description": "The metric associated with specific elements of assets detailed in the dataset.",
      "required": [
        "id",
        "type",
        "quantity_kind"
      ],
      "properties": {
        "id": {
          "title": "Identifier",
          "type": "string",
          "description": "A locally unique identifier for this metric.",
          "minLength": 1
        },
        "type": {
          "title": "Metric type",
          "description": "The type of the asset, from the closed [cost type codelist](https://rdl-standard.readthedocs.io/en/{{version}}/reference/codelists/#cost_type).",
          "type": "string",
          "codelist": "cost_type.csv",
          "openCodelist": false,
          "enum": [
            "structure",
            "content",
            "product",
            "disruption"
          ]
        },
        "quantity_kind": {
          "title": "Quantity kind",
          "type": "string",
          "description": "The kind of quantity in which the metric is specified, from the open [quantity kind codelist](https://rdl-standard.readthedocs.io/en/{{version}}/reference/codelists/#quantity_kind).",
          "codelist": "quantity_kind.csv",
          "openCodelist": true
        }
      },
      "minProperties": 1
    }
  }
}

with the quantity kind codes taken as the most relevant selection in the QUDT quantity kinds vocabulary

Code	Title
area	Area
count	Count
length	Length

One issue with this is that as it stands there isn't a way of expressing that a metric is relating to a population. Does it need to be added to the cost_type codelist? And would it make sense to rename this codelist to metric_type?
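For reference, a minimal sketch of what the proposed `Metric` rules imply for an individual entry. This is a hand-rolled check of only the `required`, `minLength` and `enum` rules shown above, not a full JSON Schema validator; the `metric_errors` helper and the example entries are hypothetical.

```python
# Rules copied from the proposed Metric definition above.
REQUIRED = ("id", "type", "quantity_kind")
COST_TYPES = {"structure", "content", "product", "disruption"}  # closed codelist


def metric_errors(metric):
    """Return a list of problems with one metric entry (empty if none found)."""
    errors = [f"missing required field: {name}" for name in REQUIRED if name not in metric]
    if "id" in metric and not (isinstance(metric["id"], str) and len(metric["id"]) >= 1):
        errors.append("id must be a non-empty string")
    if "type" in metric and metric["type"] not in COST_TYPES:
        errors.append(f"type {metric['type']!r} is not in the closed codelist")
    # quantity_kind uses an open codelist, so any string code is accepted here.
    return errors


# Hypothetical exposure metadata using the two-array layout proposed above.
exposure = {
    "metrics": {
        "non_monetary": [{"id": "1", "type": "structure", "quantity_kind": "area"}]
    }
}

ok = metric_errors(exposure["metrics"]["non_monetary"][0])
bad = metric_errors({"id": "2", "type": "population"})
```

The second entry fails twice: it has no `quantity_kind`, and `population` is rejected because the type codelist is closed, which is the gap raised in the question above.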

matamadio · 2023-08-23T08:55:57Z

Thanks Jen. Agree on the solution, including renaming as metric_type and including population.

duncandewhurst · 2023-08-23T21:34:07Z

It seems to me that there are more similarities than differences between monetary and non-monetary metrics so I would lean towards having a single metrics array.

What are the advantages of separating monetary and non-monetary metrics in the data model instead of having a single metrics array and using the quantity_kind field (with 'Currency' as an option) as a discriminator?

From a general usability point of view, there are some advantages to having a single metrics array: fewer sheets in the spreadsheet representation and I would've thought it would be easier for users to see all of the metrics in a dataset in a single list/table/sheet than to have them split into separate lists.

However, happy to hear if there is a risk-specific reason for separating them!
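A rough sketch of how a single array with `quantity_kind` as the discriminator could be consumed. The entries are hypothetical; the lower-case `currency` code and the `population` type are assumptions based on the suggestions in this thread rather than confirmed codelist values.

```python
from collections import defaultdict

# Hypothetical single metrics array; codes are assumed, not confirmed.
metrics = [
    {"id": "1", "type": "structure", "quantity_kind": "area"},
    {"id": "2", "type": "structure", "quantity_kind": "currency"},
    {"id": "3", "type": "population", "quantity_kind": "count"},
]


def group_by_quantity_kind(metrics):
    """Map each quantity kind to the ids of the metrics that use it."""
    groups = defaultdict(list)
    for metric in metrics:
        groups[metric["quantity_kind"]].append(metric["id"])
    return dict(groups)


def monetary(metrics):
    """Select the monetary metrics using quantity_kind as the discriminator."""
    return [m for m in metrics if m["quantity_kind"] == "currency"]


grouped = group_by_quantity_kind(metrics)
monetary_ids = [m["id"] for m in monetary(metrics)]
```

Users who want the monetary/non-monetary split can still get it with a one-line filter, while the spreadsheet representation needs only one sheet.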

stufraser1 · 2023-08-24T07:22:18Z

This does seem easier to use and to communicate the range of metrics, and I don't think there is a need to have them in two lists/arrays.

matamadio · 2023-08-24T08:34:31Z

Ok for single array grouping based on quantity_kind

duncandewhurst mentioned this issue Aug 16, 2023

New version for testing GFDRR/rdls-spreadsheet-template#3

Closed

matamadio changed the title ~~Exposure costs and metrics~~ [Schema] Exposure costs and metrics Aug 17, 2023

matamadio mentioned this issue Aug 18, 2023

[Docs update] Examples to be included #135

Open

24 tasks

duncandewhurst self-assigned this Aug 22, 2023

odscjen mentioned this issue Aug 24, 2023

add metric object, rename cost_type to metric_type, add quantity_kind… #204

Merged

3 tasks

odscjen closed this as completed in #204 Aug 29, 2023

duncandewhurst mentioned this issue Aug 29, 2023

rdls_schema.json: Update required fields #229

Merged

3 tasks
