0

Is it possible to view the latest delta table commit version via a Synapse serverless pool? I require this column downstream for incremental loads. This is easy to retrieve via Spark with the option readChangeFeed, however, I would like to expose this column via Synapse serverless.

I have tried reading the change feed and then persisting that back onto the table as a new column, however, that also counts as a commit so it's not ideal. I have also thought of creating a new Synapse serverless table with the commit history, but then Im doubling the tables and records (although that could just be PK and the last commit version).

Is there a better way of exposing this via Synapse serverless?

Will W
  • 95
  • 6
  • Reading the synapse serverless doco, it's all done through `OPENROWSET`. It looks like you can only get this info through spark pools. – Nick.Mc Aug 07 '23 at 01:41

1 Answers1

0

Unless a timestamp column is added, Synapse Serverless SQL currently Do NOT directly provide incremental changes functionality. To obtain the most recent version of the delta table that contains the most recent snapshot of the table, you may utilise a Spark notebook and the time travel capability of Delta.

Know more from the microsoft documentation. Query Delta Lake files using serverless SQL pool in Azure Synapse Analytics

The below syntax will give you up-to-date information from the delta table:

SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://<yourstorageaccountname>.dfs.core.windows.net/<containername>/directory/',
    FORMAT = 'delta') as rows;

Old Data:
enter image description here
New data: enter image description here

In my source I performed 5 Sql statements 2 rows inserted, 1 row updated, 1 row deleted, and then row inserted, Query against that delta table would return it’s the latest version Means the most up-to-date information.