I cannot understand the following piece of code. It seems that the upsert CDC events are applied before the delete CDC events.
# invoke hudi_write function for upserts
if df_w_upserts and df_w_upserts.count() > 0:
    hudi_write(
        df=df_w_upserts,
        schema="schema_name",
        table="table_name",
        path=path,
        mode="append",
        hudi_options=hudi_options
    )
# invoke hudi_write function for deletes
if df_w_deletes and df_w_deletes.count() > 0:
    hudi_options_copy = copy.deepcopy(hudi_options)
    hudi_options_copy["hoodie.datasource.write.operation"] = "delete"
    hudi_options_copy["hoodie.bloom.index.update.partition.path"] = False
    hudi_write(
        df=df_w_deletes,
        schema="schema_name",
        table="table_name",
        path=path,
        mode="append",
        hudi_options=hudi_options_copy
    )
My question is: how can I preserve the original order of the CDC delete/upsert events when applying them? What happens if a record is deleted and then inserted again within the same batch?
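To make the concern concrete, here is a minimal plain-Python sketch of what I think could go wrong. It is not Hudi code: the event tuples, the sequence numbers, and both function names are made up for illustration, and it assumes the two DataFrames are simply split by operation type without any per-key deduplication.

```python
# Hypothetical CDC events for a single key, in the order they occurred:
# (sequence number, operation, key, value). None of this is Hudi API.
events = [
    (1, "delete", "id-1", None),
    (2, "upsert", "id-1", "new value"),
]


def apply_in_event_order(events):
    """Replay every event strictly by its sequence number."""
    table = {}
    for _, op, key, value in sorted(events, key=lambda e: e[0]):
        if op == "delete":
            table.pop(key, None)
        else:
            table[key] = value
    return table


def apply_upserts_then_deletes(events):
    """Mimic the batch logic above: all upserts first, then all deletes."""
    table = {}
    for _, op, key, value in events:
        if op != "delete":
            table[key] = value
    for _, op, key, _ in events:
        if op == "delete":
            table.pop(key, None)
    return table


in_order = apply_in_event_order(events)
batched = apply_upserts_then_deletes(events)
```

With these two events, replaying in order leaves the re-inserted record in place, while the upserts-then-deletes approach ends up removing it. That mismatch is what I am worried about.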
Thanks in advance.