We are trying to delete data from a Delta Lake table using an AWS Glue job. Why is the merge condition not working for deletes?
The delete works fine when my delete_condition is:
changes.flag = True
However, it does not perform any deletes when the delete_condition is:
source.date_field > date_sub(current_date(),7)
The same date condition also works fine if I use a direct delete instead of the merge:
delta_source.delete("date_field > date_sub(current_date(), 7)")
The merge part of the code is:
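from delta.tables import DeltaTable

delta_source = DeltaTable.forPath(spark, f"{delta_path}")
# insert_command is passed as the merge (match) condition between "source" and "changes"
delta_merger_0 = delta_source.alias("source").merge(
    latest_change_for_each_key.alias("changes"), insert_command
)
# Delete matched rows that satisfy delete_condition
delta_merger_1 = delta_merger_0.whenMatchedDelete(
    condition=delete_condition
)
delta_merger_0 = delta_merger_1
# Remaining matched rows are updated; unmatched incoming rows are inserted
delta_merger_0.whenMatchedUpdateAll().whenNotMatchedInsertAll().execute()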
delta_source is the Delta Lake table, and latest_change_for_each_key is the DataFrame of incremental records.
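To make the difference between the two approaches concrete, here is a toy model in plain Python. It is not Delta Lake's implementation, only a sketch. It assumes, following the documented MERGE semantics, that whenMatched clauses are evaluated only for table rows that match a row in the changes DataFrame on the merge condition, while a direct delete checks every row in the table. The names toy_merge, toy_delete, is_recent, the key column id, and the sample rows are all hypothetical.

```python
from datetime import date, timedelta


def toy_merge(source_rows, change_rows, key, delete_if):
    """Toy model of: whenMatchedDelete(cond) -> whenMatchedUpdateAll -> whenNotMatchedInsertAll.

    "source" is the existing table and "changes" the incoming rows,
    mirroring the aliases used in the question.
    """
    changes_by_key = {row[key]: row for row in change_rows}
    result = []
    for row in source_rows:
        change = changes_by_key.get(row[key])
        if change is None:
            # No matching change: no whenMatched clause is evaluated, row is kept.
            result.append(row)
        elif delete_if(row):
            # Matched and delete condition holds: row is dropped.
            continue
        else:
            # Matched, delete condition false: row is overwritten by the change.
            result.append(dict(change))
    # Incoming rows with no match in the table are inserted.
    existing_keys = {row[key] for row in source_rows}
    result.extend(dict(c) for c in change_rows if c[key] not in existing_keys)
    return result


def toy_delete(source_rows, predicate):
    """Toy model of a direct delete: the predicate is checked against every row."""
    return [row for row in source_rows if not predicate(row)]


today = date.today()
cutoff = today - timedelta(days=7)  # stands in for date_sub(current_date(), 7)


def is_recent(source_row):
    # Stands in for: source.date_field > date_sub(current_date(), 7)
    return source_row["date_field"] > cutoff


source_rows = [
    {"id": 1, "date_field": today},                       # recent, has an incoming change
    {"id": 2, "date_field": today},                       # recent, no incoming change
    {"id": 3, "date_field": today - timedelta(days=30)},  # old, no incoming change
]
change_rows = [{"id": 1, "date_field": today}]

merged = toy_merge(source_rows, change_rows, "id", is_recent)
deleted = toy_delete(source_rows, is_recent)

print([r["id"] for r in merged])
print([r["id"] for r in deleted])
```

In this model the merge removes only row 1 (recent and present in the changes), while the direct delete removes rows 1 and 2. If the real job behaves the same way, a date-based delete_condition would only ever delete rows whose keys appear in latest_change_for_each_key, which may be one reason it looks as if nothing is deleted.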