1

I have a table like following Qubole:

use dm;

CREATE EXTERNAL TABLE IF NOT EXISTS fact (
    id string,
    fact_attr struct<
        attr1 : String,
        attr2 : String
    >
)
STORED AS PARQUET
LOCATION 's3://my-bucket/DM/fact'

I have created parallel table in Snowflake like following:

CREATE TABLE IF NOT EXISTS dm.fact (
    id string,
    fact_attr variant
)

My ETL process loads the data into qubole table like:

+------------+--------------------------------+
| id         | fact_attr                      |
+------------+--------------------------------+
| 1          | {"attr1": "a1", "attr2": "a2"} |
| 2          | {"attr1": "a3", "attr2": null} |
+------------+--------------------------------+

I am trying to sync this data to snowflake using Merge command, like

MERGE INTO DM.FACT dst USING %s src 
    ON dst.id = src.id
WHEN MATCHED THEN UPDATE SET
    fact_attr = parse_json(src.fact_attr)
WHEN NOT MATCHED THEN INSERT (
    id,
    fact_attr
) VALUES (
    src.id,
    parse_json(src.fact_attr)
);

I am using PySpark to sync the data:

df.write \
  .option("sfWarehouse", sf_warehouse) \
  .option("sfDatabase", sf_database) \
  .option("sfSchema", sf_schema) \
  .option("postactions", query) \
  .mode("overwrite") \
  .snowflake("snowflake", sf_warehouse, sf_temp_table)

With above command I am getting following error:

pyspark.sql.utils.IllegalArgumentException: u"Don't know how to save StructField(fact_attr,StructType(StructField(attr1,StringType,true), StructField(attr2,StringType,true)),true) of type attributes to Snowflake"

I have read through following links but no success:

Question:

How can I insert/sync data from Qubole Hive table which has STRUCT field to snowflake?

Community
  • 1
  • 1
Ambrish
  • 3,627
  • 2
  • 27
  • 42

1 Answers1

0

The version of your Spark Connector for Snowflake in use at the time of trying this lacked support for variant data types.

Support was introduced in their connector version 2.4.4 (released July 2018) onwards, where the StructType fields are now auto-mapped to a VARIANT data type that will work with your MERGE command.

Harsh J
  • 666
  • 4
  • 7