Source Hubspot: Add dbt converter #62

159 changes: 153 additions & 6 deletions connectors/source_hubspot/README.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,156 @@
# Airbyte source_hubspot dbt Package
# Hubspot Support Airbyte dbt Package

This package contains dbt models for Airbyte source_hubspot source.
---

What it includes:
- This package contains dbt models for working with the Airbyte Hubspot Support connector.
- The package is compatible with the latest version of the Airbyte Hubspot Support connector.
- Currently, it is limited to transformations that make the connector output compatible with [Fivetran's modeling dbt package](https://github.com/fivetran/dbt_hubspot/tree/main).
- In the future, specific models will be applied directly to the Airbyte connector output. If you have an idea or want to propose an analytical model for this source, please refer to the contributing guide, which explains how to propose a new transformation model.
- This package was tested with the BigQuery, Snowflake, and Postgres data warehouses.

* A complete source description
* ERD model for the source
* Diagram documentation for the source
---

## 🎯 Instructions for use

### Airbyte dbt Package

For now, Airbyte dbt packages aren't versioned, so you must install this package from git using the `subdirectory` option. No transformation model is applied directly to this package yet, but you can generate docs and run tests with dbt.

Create the following files:

**`dbt_project.yml`**

```yaml
vars:
  using_fivetran_model: False
  airbyte_database: "airbyte_db_default"
  airbyte_schema: "airbyte_dbt_hubspot_support"

  hubspot_database: "airbyte_db_default"
  hubspot_schema: "airbyte_dbt_source_hubspot"
```

**`packages.yml`**

```yaml
packages:
  - git: "https://github.com/airbytehq/airbyte-dbt-models.git"
    subdirectory: "connectors/source_hubspot"
```

Afterwards, you can run `dbt test` or `dbt docs generate` to preview the Airbyte output data.

### Fivetran Hubspot Modeling dbt package

This package transforms Airbyte connector output data, making it compatible with Fivetran's Hubspot dbt package. You can check the analytical models Fivetran creates [here](https://github.com/fivetran/dbt_hubspot/tree/main?tab=readme-ov-file#-what-does-this-dbt-package-do). The link also provides information about how the package works and what is configurable.

Create the required files to use the Airbyte and Fivetran dbt packages:

**`packages.yml`**

```yaml
packages:
  - git: "https://github.com/airbytehq/airbyte-dbt-models.git"
    subdirectory: "connectors/source_hubspot"

  - package: fivetran/hubspot
    version: [">=0.16.0", "<0.17.0"]
```

The following is the default variable definition; you must configure it for the models to be created.
At the moment, this package doesn't support schedules, domains, user tags, ticket form history, or organization tags, so keep those variables set to `False`.
Variables matching `hubspot_..._identifier` hold the names of the tables generated by the Airbyte connector. If you configured your sync with a prefix, make sure you edit them accordingly.

**`dbt_project.yml`**

```yaml
vars:
  # Required by Airbyte dbt model
  using_fivetran_model: True
  airbyte_database: "airbyte_db_default"
  airbyte_schema: "airbyte_dbt_source_hubspot"

  # Required by Fivetran dbt model
  hubspot_database: "airbyte_db_default"
  hubspot_schema: "airbyte_dbt_source_hubspot"

  hubspot_marketing_enabled: false # Disables all marketing models
  hubspot_contact_enabled: false # Disables the contact models
  hubspot_contact_list_enabled: false # Disables contact list models
  hubspot_contact_list_member_enabled: false # Disables contact list member models
  hubspot_contact_property_enabled: false # Disables the contact property models
  hubspot_contact_property_history_enabled: false # Disables the contact property history models
  hubspot_email_event_enabled: false # Disables all email_event models and functionality
  hubspot_email_event_bounce_enabled: false
  hubspot_email_event_click_enabled: false
  hubspot_email_event_deferred_enabled: false
  hubspot_email_event_delivered_enabled: false
  hubspot_email_event_dropped_enabled: false
  hubspot_email_event_forward_enabled: false
  hubspot_email_event_open_enabled: false
  hubspot_email_event_print_enabled: false
  hubspot_email_event_sent_enabled: false
  hubspot_email_event_spam_report_enabled: false
  hubspot_email_event_status_change_enabled: false

  hubspot_contact_merge_audit_enabled: false # Enables the use of the CONTACT_MERGE_AUDIT table (deprecated by Hubspot v3 API) for removing merged contacts in the final models.
  # If false, ~~~contacts will still be merged~~~, but using the CONTACT.property_hs_calculated_merged_vids field (introduced in v3 of the Hubspot CRM API)
  # Default = false
  # Sales

  hubspot_sales_enabled: false # Disables all sales models
  hubspot_company_enabled: false
  hubspot_company_property_history_enabled: false # Disable the company property history models
  hubspot_deal_enabled: false
  hubspot_merged_deal_enabled: false # Enables the merged_deal table, which will be used to filter out merged deals from the final deal models. False by default. Note that `hubspot_sales_enabled` and `hubspot_deal_enabled` must not be set to False.
  hubspot_deal_company_enabled: false
  hubspot_deal_contact_enabled: false
  hubspot_deal_property_history_enabled: false # Disables the deal property history tables
  hubspot_engagement_enabled: false # Disables all engagement models and functionality
  hubspot_engagement_contact_enabled: false
  hubspot_engagement_company_enabled: false
  hubspot_engagement_deal_enabled: false
  hubspot_engagement_call_enabled: false
  hubspot_engagement_email_enabled: false
  hubspot_engagement_meeting_enabled: false
  hubspot_engagement_note_enabled: false
  hubspot_engagement_task_enabled: false
  hubspot_owner_enabled: false
  hubspot_property_enabled: false # Disables property and property_option tables

  # Service
  hubspot_service_enabled: false # Enables all service models
  hubspot_ticket_deal_enabled: false

  contact_identifier: "contacts"
  contact_lists_identifier: "contact_lists"
  deals_identifier: "deals"
  companies_identifier: "companies"
  company_property_history_identifier: "companies_property_history"
  contact_list_membership_identifier: "contacts_list_memberships"
  contact_property_history_identifier: "contacts_property_history"
  deal_pipeline_identifier: "deal_pipelines"
  deal_property_history_identifier: "deals_property_history"
  email_event_identifier: "email_events"
  engagements_calls_identifier: "engagements_calls"
  engagement_email_identifier: "engagement_emails"
  ticket_pipeline_identifier: "ticket_pipelines"
  tickets_identifier: "tickets"
  engagement_note_identifier: "engagement_notes"

```
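
The identifier variables are where a table prefix would be accounted for. As a minimal sketch, assuming a hypothetical Airbyte connection configured with the prefix `hs_` (the prefix is only an illustration, not something this package sets), the overrides might look like this:

```yaml
vars:
  # Hypothetical: the Airbyte sync writes tables with the prefix "hs_",
  # so each identifier points at the prefixed table name.
  contact_identifier: "hs_contacts"
  deals_identifier: "hs_deals"
  companies_identifier: "hs_companies"
```

Only the identifiers for tables you actually sync should need changing; the others can keep the default values from the configuration above.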

After running `dbt run`, you can see the models being created.

---

## :package: Package Maintenance

- This package is maintained by the Airbyte Community.
- You can contribute at any time. Please read the Contributing Guidelines or join the Airbyte Slack channel `#airbyte-dbt-packages`.

## Supported models

- `contact_list`
- `deal`
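
As a rough sketch, turning on the two supported models would presumably mean switching the matching flags from the default configuration to `true`. The mapping of flags to models is an assumption based on the comments in that configuration, so check it against the Fivetran package documentation before relying on it:

```yaml
vars:
  using_fivetran_model: True

  # Assumed parent and child flags for the contact_list model
  hubspot_marketing_enabled: true
  hubspot_contact_list_enabled: true

  # Assumed parent and child flags for the deal model
  hubspot_sales_enabled: true
  hubspot_deal_enabled: true
```

All other `hubspot_*_enabled` flags would stay `false`, as in the default configuration.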
107 changes: 107 additions & 0 deletions connectors/source_hubspot/integration_tests/dbt_project.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,107 @@
name: integration_test_hubspot

config-version: 2

version: 0.1.0

profile: integration_tests

model-paths:
- models

macro-paths:
- macros

target-path: target

clean-targets:
- target
- dbt_modules
- logs

require-dbt-version:
- ">=1.0.0"
- <2.0.0

models:
airbyte_dbt_source_amplitude:
materialized: view
+schema: dbt_source_hubspot
staging:
materialized: view
tmp:
materialized: view

vars:
  # Required by Airbyte dbt model
  using_fivetran_model: True
  airbyte_database: "airbyte_db_default"
  airbyte_schema: "airbyte_dbt_source_hubspot"

  # Required by Fivetran dbt model
  hubspot_database: "airbyte_db_default"
  hubspot_schema: "airbyte_dbt_source_hubspot"

  hubspot_marketing_enabled: false # Disables all marketing models
  hubspot_contact_enabled: false # Disables the contact models
  hubspot_contact_list_enabled: false # Disables contact list models
  hubspot_contact_list_member_enabled: false # Disables contact list member models
  hubspot_contact_property_enabled: false # Disables the contact property models
  hubspot_contact_property_history_enabled: false # Disables the contact property history models
  hubspot_email_event_enabled: false # Disables all email_event models and functionality
  hubspot_email_event_bounce_enabled: false
  hubspot_email_event_click_enabled: false
  hubspot_email_event_deferred_enabled: false
  hubspot_email_event_delivered_enabled: false
  hubspot_email_event_dropped_enabled: false
  hubspot_email_event_forward_enabled: false
  hubspot_email_event_open_enabled: false
  hubspot_email_event_print_enabled: false
  hubspot_email_event_sent_enabled: false
  hubspot_email_event_spam_report_enabled: false
  hubspot_email_event_status_change_enabled: false

  hubspot_contact_merge_audit_enabled: false # Enables the use of the CONTACT_MERGE_AUDIT table (deprecated by Hubspot v3 API) for removing merged contacts in the final models.
  # If false, ~~~contacts will still be merged~~~, but using the CONTACT.property_hs_calculated_merged_vids field (introduced in v3 of the Hubspot CRM API)
  # Default = false
  # Sales

  hubspot_sales_enabled: false # Disables all sales models
  hubspot_company_enabled: false
  hubspot_company_property_history_enabled: false # Disable the company property history models
  hubspot_deal_enabled: false
  hubspot_merged_deal_enabled: false # Enables the merged_deal table, which will be used to filter out merged deals from the final deal models. False by default. Note that `hubspot_sales_enabled` and `hubspot_deal_enabled` must not be set to False.
  hubspot_deal_company_enabled: false
  hubspot_deal_contact_enabled: false
  hubspot_deal_property_history_enabled: false # Disables the deal property history tables
  hubspot_engagement_enabled: false # Disables all engagement models and functionality
  hubspot_engagement_contact_enabled: false
  hubspot_engagement_company_enabled: false
  hubspot_engagement_deal_enabled: false
  hubspot_engagement_call_enabled: false
  hubspot_engagement_email_enabled: false
  hubspot_engagement_meeting_enabled: false
  hubspot_engagement_note_enabled: false
  hubspot_engagement_task_enabled: false
  hubspot_owner_enabled: false
  hubspot_property_enabled: false # Disables property and property_option tables

  # Service
  hubspot_service_enabled: false # Enables all service models
  hubspot_ticket_deal_enabled: false

  contact_identifier: "contacts"
  contact_lists_identifier: "contact_lists"
  deals_identifier: "deals"
  companies_identifier: "companies"
  company_property_history_identifier: "companies_property_history"
  contact_list_membership_identifier: "contacts_list_memberships"
  contact_property_history_identifier: "contacts_property_history"
  deal_pipeline_identifier: "deal_pipelines"
  deal_property_history_identifier: "deals_property_history"
  email_event_identifier: "email_events"
  engagements_calls_identifier: "engagements_calls"
  engagement_email_identifier: "engagement_emails"
  ticket_pipeline_identifier: "ticket_pipelines"
  tickets_identifier: "tickets"
  engagement_note_identifier: "engagement_notes"
13 changes: 13 additions & 0 deletions connectors/source_hubspot/integration_tests/package-lock.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
packages:
  - local: ../
  - package: fivetran/hubspot
    version: 0.18.0
  - package: fivetran/hubspot_source
    version: 0.15.0
  - package: fivetran/fivetran_utils
    version: 0.4.10
  - package: dbt-labs/spark_utils
    version: 0.3.0
  - package: dbt-labs/dbt_utils
    version: 1.2.0
sha1_hash: 47c91937972f9cab62436ed1528654bc219bf3b8
11 changes: 11 additions & 0 deletions connectors/source_hubspot/integration_tests/packages.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
packages:
  - local: ../

  - package: fivetran/hubspot
    version: [">=0.18.0", "<0.19.0"]

  - package: fivetran/hubspot_source
    version: [">=0.15.0", "<0.16.0"]

  - package: fivetran/fivetran_utils
    version: [">=0.4.0", "<0.5.0"]
1 change: 1 addition & 0 deletions connectors/source_hubspot/integration_tests/vars
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
{airbyte_database: $AB_DB, hubspot_database: $AB_DB}
17 changes: 17 additions & 0 deletions connectors/source_hubspot/models/fivetran_converter/company.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
SELECT
id AS company_id,
archived AS is_company_deleted,
{{ dbt.current_timestamp() }} AS _fivetran_synced,
properties_name AS company_name,
properties_description AS description,
CAST(properties_createdate AS {{ dbt.type_timestamp() }}) AS created_date,
properties_industry AS industry,
properties_address AS street_address,
properties_address2 AS street_address_2,
properties_city AS city,
properties_state AS state,
properties_country AS country,
CAST(properties_annualrevenue AS {{ dbt.type_float() }}) AS company_annual_revenue
FROM
{{ source('source_hubspot', 'companies') }}
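
The `source('source_hubspot', 'companies')` call above resolves against a dbt source definition that is not shown in this excerpt. A minimal sketch of what such a definition could look like, assuming it reuses the `airbyte_database`, `airbyte_schema`, and `companies_identifier` variables from `dbt_project.yml` (the file and its exact layout are hypothetical):

```yaml
version: 2

sources:
  - name: source_hubspot
    # Assumption: database and schema come from the Airbyte vars, falling back to the target
    database: "{{ var('airbyte_database', target.database) }}"
    schema: "{{ var('airbyte_schema', target.schema) }}"
    tables:
      - name: companies
        # Assumption: the physical table name is overridable via companies_identifier
        identifier: "{{ var('companies_identifier', 'companies') }}"
```

With a definition along these lines, changing `companies_identifier` would redirect the model to a differently named table without touching the SQL.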

43 changes: 43 additions & 0 deletions connectors/source_hubspot/models/fivetran_converter/company.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
version: 2

models:
  - name: company
    schema: "{{ var('airbyte_schema', target.schema) }}"
    database: "{{ var('airbyte_database', target.database) }}"
    identifier: "{{ var('company_identifier', 'company') }}"
    description: Transformed data from HubSpot companies table.
    config:
      +materialized: table
      +schema: hubspot
      +enabled: "{{ var('using_fivetran_model', False) }}"
    columns:
      - name: company_id
        description: Unique identifier for the company
        tests:
          - unique
          - not_null
      - name: is_company_deleted
        description: Indicates whether the company is archived or active
      - name: _fivetran_synced
        description: Timestamp of when this record was last synced by Fivetran
      - name: company_name
        description: The name of the company
      - name: description
        description: Company description
      - name: created_date
        description: Date and time when the company was created
      - name: industry
        description: The industry of the company
      - name: street_address
        description: Primary street address of the company
      - name: street_address_2
        description: Secondary street address of the company
      - name: city
        description: City where the company is located
      - name: state
        description: State where the company is located
      - name: country
        description: Country where the company is located
      - name: company_annual_revenue
        description: Annual revenue of the company

Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
SELECT
{{ dbt.current_timestamp() }} AS _fivetran_synced,
companyId AS company_id,
property AS field_name,
sourceType AS change_source,
sourceId AS change_source_id,
CAST(timestamp AS {{ dbt.type_timestamp() }}) AS change_timestamp,
value AS new_value,
updatedByUserId AS updated_by_user_id,
archived AS is_archived
FROM
{{ source('source_hubspot', 'companies_property_history') }}