After we have created all required dimension tables, we can now create the fact table for fct_sales
.
Let’s create the new dbt model files that will contain our transformation code. Under adventureworks/models/marts, create two files:
fct_sales.sql
: This file will contain our SQL transformation code.fct_sales.yml
: This file will contain our documentation and tests forfct_sales
.
adventureworks/models/
└── marts
├── fct_sales.sql
├── fct_sales.yml
To answer the business questions, we need columns from both salesorderheader
and salesorderdetail
. Let’s reflect that in fct_sales.sql
:
with stg_salesorderheader as (
select
salesorderid,
customerid,
creditcardid,
shiptoaddressid,
status as order_status,
cast(orderdate as date) as orderdate
from {{ ref('salesorderheader') }}
),
stg_salesorderdetail as (
select
salesorderid,
salesorderdetailid,
productid,
orderqty,
unitprice,
unitprice * orderqty as revenue
from {{ ref('salesorderdetail') }}
)
...
The grain of the fct_sales
table is one record in the SalesOrderDetail table, which describes the quantity of a product within a SalesOrderHeader. So we perform a join between salesorderheader
and salesorderdetail
to achieve that grain.
...
select
...
from stg_salesorderdetail
inner join stg_salesorderheader on stg_salesorderdetail.salesorderid = stg_salesorderheader.salesorderid
Next, we create the surrogate key to uniquely identify each row in the fact table. Each row in the fct_sales
table can be uniquely identified by the salesorderid
and the salesorderdetailid
which is why we use both columns in the generate_surrogate_key()
macro.
...
select
{{ dbt_utils.generate_surrogate_key(['stg_salesorderdetail.salesorderid', 'salesorderdetailid']) }} as sales_key,
...
from stg_salesorderdetail
inner join stg_salesorderheader on stg_salesorderdetail.salesorderid = stg_salesorderheader.salesorderid
You can now select the fact table columns that will help us answer the business questions identified earlier. We want to be able to calculate the amount of revenue, and therefore we include a column revenue per sales order detail which is calculated by unitprice * orderqty as revenue
.
...
select
{{ dbt_utils.generate_surrogate_key(['stg_salesorderdetail.salesorderid', 'salesorderdetailid']) }} as sales_key,
stg_salesorderdetail.salesorderid,
stg_salesorderdetail.salesorderdetailid,
stg_salesorderdetail.unitprice,
stg_salesorderdetail.orderqty,
stg_salesorderdetail.revenue
from stg_salesorderdetail
inner join stg_salesorderheader on stg_salesorderdetail.salesorderid = stg_salesorderheader.salesorderid
We want to be able to slice and dice our fact table against the dimension tables we have created in the earlier step. So we need to create the foreign surrogate keys that will be used to join the fact table back to the dimension tables.
We achieve this by applying the generate_surrogate_key()
macro to the same unique id columns that we had previously used when generating the surrogate keys in the dimension tables.
...
select
{{ dbt_utils.generate_surrogate_key(['stg_salesorderdetail.salesorderid', 'salesorderdetailid']) }} as sales_key,
{{ dbt_utils.generate_surrogate_key(['productid']) }} as product_key,
{{ dbt_utils.generate_surrogate_key(['customerid']) }} as customer_key,
{{ dbt_utils.generate_surrogate_key(['creditcardid']) }} as creditcard_key,
{{ dbt_utils.generate_surrogate_key(['shiptoaddressid']) }} as ship_address_key,
{{ dbt_utils.generate_surrogate_key(['order_status']) }} as order_status_key,
{{ dbt_utils.generate_surrogate_key(['orderdate']) }} as order_date_key,
stg_salesorderdetail.salesorderid,
stg_salesorderdetail.salesorderdetailid,
stg_salesorderdetail.unitprice,
stg_salesorderdetail.orderqty,
stg_salesorderdetail.revenue
from stg_salesorderdetail
inner join stg_salesorderheader on stg_salesorderdetail.salesorderid = stg_salesorderheader.salesorderid
You may choose from one of the following materialization types supported by dbt:
- View
- Table
- Incremental
It is common for fact tables to be materialized as incremental
or table
depending on the data volume size. As a rule of thumb, if you are transforming millions or billions of rows, then you should start using the incremental
materialization. In this example, we have chosen to go with table
for simplicity.
Alongside our fct_sales.sql
model, we can populate the corresponding fct_sales.yml
file to document and test our model.
version: 2
models:
- name: fct_sales
columns:
- name: sales_key
description: The surrogate key of the fct sales
tests:
- not_null
- unique
- name: product_key
description: The foreign key of the product
tests:
- not_null
- name: customer_key
description: The foreign key of the customer
tests:
- not_null
...
- name: orderqty
description: The quantity of the product
tests:
- not_null
- name: revenue
description: The revenue obtained by multiplying unitprice and orderqty
Execute the dbt run and dbt test commands to run and test your dbt models:
dbt run && dbt test
Great work, you have successfully created your very first fact and dimension tables! Our dimensional model is now complete!! 🎉