I use Structured Streaming and Delta to keep two tables (A -> B) in sync. Rather than running a continuous streaming job, I use the AvailableNow trigger to run the update only once a day. B has a checkpoint that tracks its progress through A.

When a batch starts, the sync resumes automatically from where it stopped, and everything runs as expected. However, I would like to know which version of A the batch starts from.

For example: A is at version 10, and B is in sync with version 7 of A. This means B needs to ingest versions 8, 9 and 10 of A.

It's fairly easy to get the latest version of A using DeltaTable.history. But how can I extract the corresponding information from B's checkpoint?
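To make the question concrete, here is a minimal sketch of one thing that might work: reading the newest file under the checkpoint's `offsets` directory and pulling out the Delta source offset. This relies on several assumptions that are not confirmed by any public API: that the checkpoint sits on a locally mounted path, that offset files are named by batch id, and that the Delta source writes a JSON line containing a `reservoirVersion` field. The function name `read_latest_delta_offset` is hypothetical, and the file layout is an internal detail that may differ between Delta releases.

```python
import json
from pathlib import Path


def read_latest_delta_offset(checkpoint_dir):
    """Return (batch_id, offset_dict) from the newest offsets file, or None.

    Assumption: the checkpoint is reachable as a local directory and the
    Delta source offset is a JSON line that has a "reservoirVersion" key.
    """
    offsets_dir = Path(checkpoint_dir) / "offsets"
    if not offsets_dir.is_dir():
        return None

    # Offset files are assumed to be named by batch id (0, 1, 2, ...).
    # Hidden files such as ".1.crc" are skipped by the isdigit() check.
    batch_files = [
        p for p in offsets_dir.iterdir() if p.is_file() and p.name.isdigit()
    ]
    if not batch_files:
        return None

    latest = max(batch_files, key=lambda p: int(p.name))

    # Assumed layout: a version header line, a metadata JSON line,
    # then one JSON line per source.
    for line in latest.read_text().splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue
        record = json.loads(line)
        if "reservoirVersion" in record:
            return int(latest.name), record
    return None
```

If those assumptions hold, `read_latest_delta_offset("/path/to/checkpoint")` would return the last committed batch id together with the raw offset fields. How `reservoirVersion` and `index` translate into "the last fully ingested version of A" appears to depend on the Delta release, so the value is best treated as a hint and cross-checked against DeltaTable.history.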

Alex Ott
pgrandjean

0 Answers