I use Structured Streaming and Delta to keep two tables (A -> B) in sync. Rather than running a continuous streaming job, I use the AvailableNow trigger to run the update only once a day. B has a checkpoint that tracks its progress through A.
When a batch starts, the sync resumes automatically from where it stopped, and everything runs as expected. However, I would like to know which version of A the batch starts from.
For example: A is at version 10, and B is in sync with version 7 of A. This means B needs to ingest versions 8, 9 and 10 of A.
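To make the example concrete, here is a minimal sketch of the arithmetic in plain Python. The helper name `pending_versions` is hypothetical and only illustrates the example; it is not part of Spark or Delta.

```python
def pending_versions(synced_version: int, latest_version: int) -> list[int]:
    """Versions of A committed after the last one B has processed.

    Hypothetical helper used only to illustrate the example.
    """
    # Every version strictly after the synced one, up to and including the latest.
    return list(range(synced_version + 1, latest_version + 1))


print(pending_versions(7, 10))  # [8, 9, 10]
```

The `latest_version` side is easy to obtain; the `synced_version` side is the value I cannot find.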
It's fairly easy to get the latest version of A using DeltaTable.history. However, how can I extract the corresponding information from B's checkpoint?
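For reference, the DeltaTable.history lookup mentioned above could look like the sketch below. It assumes `delta_table` is a `delta.tables.DeltaTable`; the function name `latest_version` is hypothetical, and so is the table path in the usage comment.

```python
def latest_version(delta_table) -> int:
    """Return the newest commit version of a Delta table.

    `delta_table` is assumed to be a delta.tables.DeltaTable.
    """
    # history(1) returns a DataFrame containing only the most recent commit.
    row = delta_table.history(1).select("version").first()
    return row["version"]


# Usage (requires a running SparkSession named `spark`; the path is hypothetical):
# from delta.tables import DeltaTable
# table_a = DeltaTable.forPath(spark, "/path/to/A")
# print(latest_version(table_a))
```

This only covers the source side (A); it says nothing about how far B's checkpoint has progressed.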