2

I'm having issues when running the command MERGE INTO on Snowflake. The data is located in a bucket on S3. The files format are .snappy.parquet.

The command runs well, it identifies the files in S3, but it loads only NULL values to the table. The total row numbers are also good.

I confirmed that @myExternalStageToS3 points to the right location by running a query which returned the expected values:

SELECT
    $1:DAY,
    $1:CHANNEL_CATEGORY,
     $1:SOURCE,
    $1:PLATFORM,
    $1:LOB
@myExternalStageToS3
(FILE_FORMAT => 'sf_parquet_format')

As it is a new table with no records, the condition uses INSERT.

MERGE INTO myTable as target USING
(
    SELECT
    $1:DAY,
    $1:CHANNEL_CATEGORY,
    $1:SOURCE,
    $1:PLATFORM,
    $1:LOB
    FROM @myExternalStageToS3
    (FILE_FORMAT => 'sf_parquet_format')
  ) as src 
        ON target.CHANNEL_CATEGORY = src.$1:CHANNEL_CATEGORY 
        AND target.SOURCE = src.$1:SOURCE 
WHEN MATCHED THEN
UPDATE SET 
DAY= src.$1:DAY
,CHANNEL_CATEGORY= src.$1:CHANNEL_CATEGORY
,SOURCE= src.$1:SOURCE
,PLATFORM= src.$1:PLATFORM
,LOB= src.$1:LOB
WHEN NOT MATCHED THEN
INSERT
(
      DAY,
      CHANNEL_CATEGORY,
      SOURCE,
      PLATFORM,
      LOB
) VALUES
(
  src.$1:DAY,
  src.$1:CHANNEL_CATEGORY,
  src.$1:SOURCE,
  src.$1:PLATFORM,
  src.$1:LOB
);

The sf_parque_format was created with these details:

create or replace file format sf_parquet_format
    type = 'parquet'
    compression = auto;

Do you have any idea what am I missing?

ultraInstinct
  • 4,063
  • 10
  • 36
  • 53

1 Answers1

1

The query inside USING part was altered(data type casts and aliases):

MERGE INTO myTable as target USING (
    SELECT
    $1:DAY::TEXT AS DAY,
    $1:CHANNEL_CATEGORY::TEXT AS CHANNEL_CATEGORY,
    $1:SOURCE::TEXT AS SOURCE,
    $1:PLATFORM::TEXT AS PLATFROM,
    $1:LOB::TEXT AS LOB
    FROM @myExternalStageToS3
    (FILE_FORMAT => 'sf_parquet_format')
  ) as src 
        ON target.CHANNEL_CATEGORY = src.CHANNEL_CATEGORY 
        AND target.SOURCE = src.SOURCE 
WHEN MATCHED THEN
UPDATE SET 
 DAY= src.DAY
,PLATFORM= src.PLATFORM
,LOB= src.LOB
WHEN NOT MATCHED THEN
INSERT  (
      DAY,
      CHANNEL_CATEGORY,
      SOURCE,
      PLATFORM,
      LOB
) VALUES  (
  src.DAY,
  src.CHANNEL_CATEGORY,
  src.SOURCE,
  src.PLATFORM,
  src.LOB
);

The UPDATE part does not require ,CHANNEL_CATEGORY= src.CHANNEL_CATEGORY ,SOURCE= src.SOURCE as condition is already met by ON clasue.

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275