1

Why I get a "Syntax error:L compilation error: (line ..." when trying to create a Snowflake UDTF? If I comment the line: "AND IDENTIFIER(baseline) = I.COMP" the function compiles correctly but do not generate the expected results.

SQL is the handler for the function.

CREATE OR REPLACE FUNCTION fn_Shopper_Insights(dx_id varchar, business varchar, baseline varchar)    
   RETURNS TABLE(name varchar, value varchar)    
   LANGUAGE SQL    
   AS    
$$    
   SELECT V.name, V.value    
   FROM (    
         SELECT name, value    
         FROM (    
           SELECT    
               Q1.GRP AS NAME,    
               Q1.AGG AS VALUE,    
               ROW_NUMBER() OVER (PARTITION BY GRP ORDER BY INDEX DESC) AS RNK    
           FROM (             
                 -- demographics    
                 (    
                   SELECT    
                       D.DX_ID, D.PROJECT_TYPE AS GRP, D.MVS_DESC AS AGG, D.PCT, DIV0NULL(D.PCT, I.PCT) AS INDEX    
                   FROM store_pro_demos_long D    
                     INNER JOIN index_dim A USING(DX_ID)    
                     INNER JOIN index_demos_pct I    
                       ON D.MVS_DESC = I.MVS_DESC    
                       AND IDENTIFIER(baseline) = I.COMP    
                       AND I.BASELINE = baseline    
                   WHERE D.DX_ID = dx_id
                 )
           ) Q1
         ) WHERE RNK = 1
       ) V
$$;

2 Answers2

1

Three level nesting is unnecessary and could be simplified:

SELECT V.name, V.value    
FROM (    
     SELECT name, value    
     FROM (    
       SELECT    
           Q1.GRP AS NAME,    
           Q1.AGG AS VALUE,    
           ROW_NUMBER() OVER (PARTITION BY GRP ORDER BY INDEX DESC) AS RNK    
       FROM (             
             -- demographics    
             (    
               SELECT    
                   D.DX_ID, D.PROJECT_TYPE AS GRP, D.MVS_DESC AS AGG, D.PCT, DIV0NULL(D.PCT, I.PCT) AS INDEX    
               FROM store_pro_demos_long D    
                 INNER JOIN index_dim A USING(DX_ID)    
                 INNER JOIN index_demos_pct I    
                   ON D.MVS_DESC = I.MVS_DESC    
                   AND IDENTIFIER(baseline) = I.COMP    
                   AND I.BASELINE = baseline    
               WHERE D.DX_ID = dx_id
             )
       ) Q1
     ) WHERE RNK = 1
   ) V

To:

SELECT    
    D.PROJECT_TYPE AS GRP, 
    D.MVS_DESC AS AGG
FROM store_pro_demos_long AS D    
JOIN index_dim AS A USING(DX_ID)    
JOIN index_demos_pct AS I    
  ON D.MVS_DESC = I.MVS_DESC    
 AND IDENTIFIER(baseline) = I.COMP    
 AND I.BASELINE = baseline    
WHERE D.DX_ID = dx_id
QUALIFY ROW_NUMBER() OVER(PARTITION BY GRP ORDER BY DIV0NULL(D.PCT, I.PCT) DESC)=1;

Second IDENTIFIER(baseline) could be replaced with CASE expression:

AND IDENTIFIER(baseline) = I.COMP
<=>
AND I.COMP = CASE WHEN baseline = 'col_name1' THEN D.col_name1
                  WHEN baseline = 'col_name2' THEN D.col_name2
                  ELSE D.col_name_default
             END
Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
  • Hey @lukasz-szozda. Thanks for taking the time to answer. I already reduced the unnecessary nesting with your suggestion. However, my major problem is with the IDENTIFIER(baseline). If I use a case, I would have to cover all the possible values as part of the CASE. Which I am trying to avoid. – Raul E. Menendez Jun 21 '23 at 22:04
0

I am thinking in using your recommendation of replacing IDENTIFIER function with dbt and Jinja. I will create the UDTF as a dbt model, and then use a pre_hook or a macro to query the different baselines and generate the case dynamically with jinja:

AND IDENTIFIER(baseline) = I.COMP
<=>
AND I.COMP = CASE {%- for b in baselines -%}
                   WHEN baseline = {{b}} THEN D.{{b}}
                  {%- endfor -%}
                   ELSE D.col_name_default
             END