0

I've been pulling my hair out for several hours trying to understand what's going on, to no avail so far.

I've got this query on dbt:

{{
  config(
    materialized='incremental',
    unique_key='event_ID'
  )
}}

SELECT
    {{ dbt_utils.star(from=ref('staging_pg_ahoy_events'), relation_alias='events', prefix='event_') }},
    {{ dbt_utils.star(from=ref('staging_pg_ahoy_visits'), relation_alias='visits', prefix='visit_') }}
FROM
    {{ ref('staging_pg_ahoy_events') }} AS events
LEFT JOIN {{ ref('staging_pg_ahoy_visits') }} AS visits ON events.visit_id = visits.id

{% if is_incremental() %}
    WHERE "events"."event_ID" >= (SELECT max("events"."event_ID") FROM {{ this }})
{% endif %}

Along with this config:

version: 2

models:
  - name: facts_ahoy_events
    columns:
      - name: event_ID
        quote: true
        tests:
          - unique
          - not_null

dbt run -m facts_ahoy_events --full-refresh runs successfully, however when I try an incremental backup by dropping the --full-refresh flag, the following error ensues:

10:35:51  1 of 1 START incremental model DBT_PCOISNE.facts_ahoy_events.................... [RUN]
10:35:52  1 of 1 ERROR creating incremental model DBT_PCOISNE.facts_ahoy_events........... [ERROR in 0.88s]
10:35:52  
10:35:52  Finished running 1 incremental model in 3.01s.
10:35:52  
10:35:52  Completed with 1 error and 0 warnings:
10:35:52  
10:35:52  Database Error in model facts_ahoy_events (models/marts/facts/facts_ahoy_events.sql)
10:35:52    000904 (42000): SQL compilation error: error line 41 at position 10
10:35:52    invalid identifier '"events"."event_ID"'

I've gotten used to the case-sensitive column names on Snowflake, but I can't for the life of me figure out what's going on, since the following query run directly on Snowflake, completes:

select "event_ID" from DBT_PCOISNE.FACTS_AHOY_EVENTS limit 10;

Whereas this one expectedly fails:

select event_ID from DBT_PCOISNE.FACTS_AHOY_EVENTS limit 10;

I think I've tried every combination of upper, lower, and mixed casing, each with and without quoting, but all my attempts have failed.

Any help or insight would be greatly appreciated! Thank you

coisnepe
  • 480
  • 8
  • 18

1 Answers1

1

Most probably your column event_ID was created using "" around it which means an identifier was used. Now, using it also requires "" as all column names are capitalized inside Snowflake unless using identifiers. Solution is to either use "" around column name or rename it to lower case using an ALTER.

For DBT you can read more here

Sergiu
  • 4,039
  • 1
  • 13
  • 21
  • Thanks for taking the time to look into! I'm confused though, doesn't the error message prove that that's what the query is doing? `invalid identifier '"events"."event_ID"'` – coisnepe Feb 03 '22 at 13:14
  • 1
    If you check in Snowflake UI History how does the query from DBT looks like? Does it have the "" around the column name? – Sergiu Feb 03 '22 at 13:18
  • Good call on checking the queries log ! I actually combed through the query again and finally got back to the original error, which was caused by the incremental "engine" that uppercased all the columns regardless of the original query and table. This actually has been a documented issue that's supposed to have been fixed in 2019: https://github.com/dbt-labs/dbt-core/issues/1847 – coisnepe Feb 03 '22 at 14:24