Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] dbt unit tests with postgres hstore #150

Open
2 tasks done
vincentclaes opened this issue Sep 12, 2024 · 1 comment
Open
2 tasks done

[Feature] dbt unit tests with postgres hstore #150

vincentclaes opened this issue Sep 12, 2024 · 1 comment
Labels
enhancement New feature or request unit tests Issues related to built-in dbt unit testing functionality

Comments

@vincentclaes
Copy link

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Whenever my source table has a postgres hstore column type, the unittests will fail with a message like

#  An error occurred during execution of unit test 'unzipped_runs_incremental_model'. There may be an error in the unit test definition: check the data types.
#   Database Error
#    syntax error at or near "USER"
#    LINE 54: ...amp with time zone) as completed_at, cast(null as USER-DEFIN...
#                                  

It has either USER or USER-DEFINED in it and i figured out it always points to an hstore column.

Expected Behavior

That my tests would run without an issue.
Definitely if you do not specify the column in your tests.
Or is there a way to mock the input table so we can omit the hstore column?

Steps To Reproduce

  1. Create postgres db with at least 1 hstore column
  2. Write a test (does not has to include the hstore column)
  3. Run the test

Relevant log output

#  An error occurred during execution of unit test 'unzipped_runs_incremental_model'. There may be an error in the unit test definition: check the data types.
#   Database Error
#    syntax error at or near "USER"
#    LINE 54: ...amp with time zone) as completed_at, cast(null as USER-DEFIN...
#

Environment

- OS: ubuntu 20.4
- Python: 3.20
- dbt: 1.8.6

Which database adapter are you using with dbt?

postgres

Additional Context

dbt unit test:

# I keep getting the same error
dbt-labs/dbt-core#15:21:59    Runtime Error in unit_test unzipped_runs_incremental_model (models/tests/tableau/preprocessed/unzipped_runs.yml)
#  An error occurred during execution of unit test 'unzipped_runs_incremental_model'. There may be an error in the unit test definition: check the data types.
##   Database Error
##    syntax error at or near "USER"
##    LINE 54: ...amp with time zone) as completed_at, cast(null as USER-DEFIN...
##                                                                  ^
dbt-labs/dbt-core#15:21:59
dbt-labs/dbt-core#15:21:59  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
# I am doing something wrong or there is a bug in dbt unit tests
# Commenting the tests for now and move on...

unit_tests:
  - name: unzipped_runs_incremental_model
    model: unzipped_runs
    overrides:
      macros:
        is_incremental: false
    given:
      - input: source("magic_star", "runs") # upstream dependent model of the current model
        rows:
          - {id: 1}
          - {id: 2}

      - input: this # contents of current model
        rows:
          - { pipeline_run_id: 1}
    expect:
      rows:
          - { pipeline_run_id: 2}

dbt model

{{
    macro_incremental_config(
        schema="preprocessed",
        unique_fields=["pipeline_run_id"],
        index_fields=["test_card_id"],
    )
}}

with
    runs as (
        select *
        from {{ source("magic_star", "runs") }}
        {% if is_incremental() %}
            where id > (select max(pipeline_run_id) from {{ this }})
        {% endif %}
    )
select
    id as "pipeline_run_id",
    study_id as "study_id",
    assay_id as "assay_id",
    test_card_id,
    processor_version,
    decorator_host as "decorator_host",
    started_at as "started_at",
    date(started_at) as "start_date",
    started_at::time as start_time,
    completed_at as "completed_at",
    date(completed_at) as "completed_date",
    completed_at::time as completed_time,
    metadata -> 's3-input-path' as "s3_input",
    metadata -> 's3-output-directory' as "s3_output",
    metadata -> 'consumables/detection-mix-id' as "detection_mix_id",
    metadata -> 'consumables/sample-id' as "sample_id",
    metadata -> 'created-by/reader-id' as "reader_id",
    metadata -> 'created-by/reader-type' as "reader_type",
    protocol_name as "protocol_name",
    protocol_version as "protocol_version"
from runs

it even fails with the same message if I simplify to

select id from {{ source("magic_star", "runs") }}
@vincentclaes vincentclaes added bug Something isn't working triage labels Sep 12, 2024
@dbeatty10 dbeatty10 added the unit tests Issues related to built-in dbt unit testing functionality label Sep 12, 2024
@dbeatty10 dbeatty10 transferred this issue from dbt-labs/dbt-core Sep 12, 2024
@amychen1776 amychen1776 added enhancement New feature or request and removed bug Something isn't working triage labels Sep 24, 2024
@amychen1776 amychen1776 changed the title [Bug] dbt unit tests with postgres hstore [Feature] dbt unit tests with postgres hstore Sep 24, 2024
@amychen1776
Copy link

amychen1776 commented Sep 24, 2024

Hi @vincentclaes , thank you for opening this up! It seems like at the moment, we only support data types that:

  1. comes from the information_schema.columns query (or equivalent), and
  2. cast(null as DATA_TYPE_NAME) works

So we are going to do two things related to this issue:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request unit tests Issues related to built-in dbt unit testing functionality
Projects
None yet
Development

No branches or pull requests

3 participants