1

Some time ago I ran into the race condition issue when 2 separate transactions try to simultaneously check if the record exists(by 4 fields) and if no - create a new one.

My environment:

MS SQL Server, Spring Data/JPA/Hibernate

It was a duplicate records issue. I implemented the test that simulates concurrent calls and thus was able(pretty stable at 99.99% of execution times) to reproduce this issue.

Right now I'm fixed this issue by applying unique constraint over these 4 fields. At this moment my test is unable to reproduce this issue. I'm really happy with it but honestly, I don't fully understand how it works under the hood. This is exactly why I created this question - I don't understand why for example my test doesn't fail with a ConstraintViolationException while both of the concurrent transactions simultaneously check for record presence and then try to create it(each of them). In my understanding according to my test - both of the transactions work simultaneously and shouldn't find any record during the first check. After that, they should try to create new records and one of the transactions should be able to do it and another one should fail with ConstraintViolationException. But the code works fine and everything is working without any exceptions.

Is there any internal synchronization mechanisms at Spring Data/JPA/Hibernate level or even MS SQL Server that prevents concurrent transactions from incorrect work and allow them to wait for the results of work from each other? Please explain. Thanks !

alexanoid
  • 24,051
  • 54
  • 210
  • 410

1 Answers1

2

Short Answer

It all depends on indexes you have on the table and the indexes being used by your queries at the time. If different execution plans were used by different processes to check for the row's non-existence then they would both return true and both add a record due to the way SQL Server issues resource locks.

From [1]:

  1. When performing a write operation, SQL Server does not lock related indexes ..., only the relevant data row.
  2. When performing a read operation, SQL Server locks only the objects (e.g. indexes, data rows etc') that it found and used within its access path.

By adding a unique constraint on these 4 fields you implicitly and effectively added a covering index, which resulted in all of your processes using the same query plan and therefore acquiring resource locks in the same order on the same objects.

Long Answer

Data mismatch when querying with different indexes

For very detailed info about resource locking read this: [1] https://www.mssqltips.com/sqlservertip/1485/using-sql-server-indexes-to-bypass-locks/

Alex
  • 4,885
  • 3
  • 19
  • 39