
How do you append DataFrame data into a dedicated SQL pool through a Synapse Spark pool?

1 Answer


The synapsesql write method (described here) does not currently support appending to a dedicated SQL pool table, but you can simulate that behaviour with either partition switching or a CTAS that combines your original table and the new data.

Partitioning pattern:

(diagram illustrating the partition-switching pattern)
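A minimal sketch of the partition-switching approach, assuming a staging table dbo.yourTable_stage with the same schema, distribution and partition scheme as the target table (all names are illustrative, and the target partition must be empty before the switch):

-- Load the new rows into the staging table first (e.g. via CTAS or the Spark connector),
-- then switch the populated partition into the main table; the switch is a metadata-only operation.
ALTER TABLE dbo.yourTable_stage SWITCH PARTITION 2 TO dbo.yourTable PARTITION 2;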

CTAS pattern:

CREATE TABLE dbo.yourNewTable
WITH (
    DISTRIBUTION = ROUND_ROBIN, -- or HASH(someColumn)|REPLICATE
    CLUSTERED COLUMNSTORE INDEX -- or CLUSTERED INDEX|HEAP
)
AS
SELECT *
FROM dbo.someTable_part1
UNION ALL
SELECT *
FROM dbo.someTable_part2;
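Once the combined table is built and validated, you can swap it in for the original with RENAME OBJECT (object names here are illustrative):

-- Retire the original table and promote the new one under its name.
RENAME OBJECT dbo.yourOriginalTable TO yourOriginalTable_old;
RENAME OBJECT dbo.yourNewTable TO yourOriginalTable;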

You may be able to do an append with a JDBC connection (a rough sketch follows the example below), but I probably would not do that in Synapse. You could also try the saveAsTable method, bearing in mind this appends to a Spark table, not a dedicated SQL pool table, something like:

df.write.saveAsTable('yourTable', mode='append')
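For completeness, the JDBC append mentioned above might look something like the sketch below; the server, database, table and credential values are placeholders, and the SQL Server JDBC driver shipped with Synapse Spark pools is assumed:

# Placeholder endpoint for the dedicated SQL pool
jdbc_url = (
    "jdbc:sqlserver://yourworkspace.sql.azuresynapse.net:1433;"
    "database=yourDedicatedPool"
)

(df.write
    .format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "dbo.yourTable")
    .option("user", "sqladminuser")      # placeholder credentials
    .option("password", "<password>")
    .mode("append")                      # append rows to the existing table
    .save())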
wBob
    Append got added recently, see here: https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/synapse-spark-sql-pool-import-export#supported-dataframe-save-modes – wBob May 17 '22 at 23:57
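Following the linked documentation, on newer Synapse Spark runtimes the dedicated SQL pool connector accepts the standard DataFrame save modes, so the append can be expressed directly (the three-part table name below is a placeholder):

# Write the DataFrame to the dedicated SQL pool table, appending to any existing rows.
(df.write
    .mode("append")
    .synapsesql("yourDatabase.dbo.yourTable"))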