For one of my use cases I am using the change data feed (CDF) feature of Delta Lake. CDF itself works well, but when I read the data to insert into the gold layer, the read returns changes from all versions. Is there a way to read only the latest version without specifying a version number, or a way to fetch the latest version number?
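# Batch-read the change feed and drop the pre-update row images
# (the CDF value for those rows is "update_preimage", not "preimage")
return spark.read.format("delta") \
    .option("readChangeFeed", "true") \
    .table(tableName) \
    .where(col("_change_type") != "update_preimage")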
The code block above returns results from all versions since the start. I can fetch only the latest data by looking at the table and specifying the version manually, but I don't understand how to do this in production. I don't want to use a timestamp to fetch the latest version: in case of retries, someone might run the pipeline multiple times a day, which would introduce data inaccuracies if a rerun is not handled like the first run of the day. Any help would be appreciated.
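For reference, here is a rough sketch of the kind of thing I am after, not a confirmed solution. It assumes the latest commit can be looked up with `DESCRIBE HISTORY ... LIMIT 1` and that the change feed read accepts `startingVersion` and `endingVersion`. The function names `latest_version` and `read_latest_changes` are hypothetical, and `spark` is assumed to be an existing SparkSession with Delta Lake enabled.

```python
def latest_version(spark, table_name):
    """Return the most recent commit version of a Delta table (hypothetical helper)."""
    # DESCRIBE HISTORY returns commits newest first, so LIMIT 1 is the latest commit.
    row = spark.sql(f"DESCRIBE HISTORY {table_name} LIMIT 1").select("version").first()
    return row["version"]


def read_latest_changes(spark, table_name):
    """Read change feed rows for the latest table version only (hypothetical helper)."""
    version = latest_version(spark, table_name)
    return (
        spark.read.format("delta")
        .option("readChangeFeed", "true")
        # Pin both ends of the range to the same version: assumes only the
        # newest commit is wanted, not everything since the previous run.
        .option("startingVersion", version)
        .option("endingVersion", version)
        .table(table_name)
        # Drop the "before" image of updated rows.
        .where("_change_type != 'update_preimage'")
    )
```

I am not sure this is enough on its own: if several commits land between two pipeline runs, reading only the newest version would skip the earlier ones, and the newest commit might be a maintenance operation rather than a data change. Storing the last processed version somewhere and reading from the version after it might be safer for retries, but I have not tried that.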