0

I have a simple requirement to write a dataframe from spark (databricks) to a synapse dedicated pool table and keep refreshing (truncating) it on daily basis without dropping it.

Documentation suggests to use truncate with overwrite mode but that doesn't seem to be working as expected for me. As i continue to see table creation date getting updated

I am using

df.write \
      .format("com.databricks.spark.sqldw") \
      .option("url", synapse_jdbc) \
      .option("tempDir", tempDir) \
      .option("useAzureMSI", "true") \
      .option("dbTable", table_name) \
      .mode("overwrite") \
      .option("truncate","true") \
      .save()

But there doesn't seem to be any difference whether i use truncate or not. Creation date/time of the table in synapse gets updated everytime i execute the above from databricks. Can anyone please help with this, what am i missing?

I already have a workaround which works but seems more like a workaround

.option("preActions", "truncate table "+table_name) \
.mode("append") \

1 Answers1

0

I tried to reproduce your scenario in my environment and the truncate is not working for me with the synapse connector.

While researching this issue I found that not all Options are supported to synapse connector In the Official Microsoft document the provided the list of supported options like dbTable, query, user, password, url, encrypt=true, jdbcDriver, tempDir, tempCompression, forwardSparkAzureStorageCredentials, useAzureMSI , enableServicePrincipalAuth, etc.

Truncate ate option is supported to the jdbc format nots synapse connector.

When I change the format form com.databricks.spark.sqldw to jdbc it's working fine now.

My Code:

df.write.format("jdbc")
    .option("url",synapse_jdbc)
    .option("forwardSparkAzureStorageCredentials", "true")
    .option("dbTable", table_name)
    .option("tempDir", tempdir)
    .option("truncate","true")
    .mode("overwrite")
    .save()

First execution:

enter image description here

enter image description here

Second execution:

enter image description here

enter image description here

conclusion: For both the time when code is executed table creation time is same means overwrite is not dropping table it is truncating table

Pratik Lad
  • 4,343
  • 2
  • 3
  • 11