When running two concurrent updates on the same partition of an Iceberg table using Spark, I get the following error: Found new conflicting delete files that can apply to records matching ...
The updates target two different rows in the partition (the rows are even in different data files). Furthermore, I have set the isolation level to snapshot.
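For reference, this is roughly how I set the isolation level, via a table property in Spark SQL (catalog and table names below are placeholders; write.update.isolation-level is the property governing UPDATE conflicts, with analogous write.delete.isolation-level and write.merge.isolation-level properties for deletes and merges):

```sql
-- Placeholder catalog/table names; sets snapshot isolation for UPDATE operations
ALTER TABLE my_catalog.db.my_table
SET TBLPROPERTIES ('write.update.isolation-level' = 'snapshot');
```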
Based on the experiments I have carried out, the serializable isolation level seems to behave in exactly the same way as snapshot.
All of this is confusing because the docs state that Iceberg supports concurrent writes using optimistic concurrency: https://iceberg.apache.org/docs/1.2.1/reliability/#concurrent-write-operations.
It even states the following: "Writers avoid expensive retry operations by structuring changes so that work can be reused across retries."
I can confirm that inserts to the same partition and updates on different partitions work as expected.
Is this the expected behaviour?