When running two concurrent updates on the same partition of an Iceberg table using Spark, I get the following error: Found new conflicting delete files that can apply to records matching ...
The updates target two different rows in the partition (the rows are even in different data files). Furthermore, I have set the isolation level to snapshot.
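For reference, this is roughly how I set the isolation level, via a table property in Spark SQL (catalog and table names below are placeholders; write.update.isolation-level is the property governing UPDATE conflicts, with analogous write.delete.isolation-level and write.merge.isolation-level properties for deletes and merges):

```sql
-- Placeholder catalog/table names; sets snapshot isolation for UPDATE operations
ALTER TABLE my_catalog.db.my_table
SET TBLPROPERTIES ('write.update.isolation-level' = 'snapshot');
```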
Based on the experiments I have carried out, the serializable isolation level seems to behave in exactly the same way as snapshot.
All of this is confusing because the docs state that Iceberg supports concurrent writes using optimistic concurrency: https://iceberg.apache.org/docs/1.2.1/reliability/#concurrent-write-operations.
It even states the following: "Writers avoid expensive retry operations by structuring changes so that work can be reused across retries."
I can confirm that inserts to the same partition and updates on different partitions work as expected.
Is this the expected behaviour?