Insert Concurrency Issue - Multithreaded Environment

Question

I have an issue that the same stored procedure is being invoked at exactly the same time with exactly the same paramenters.

The purpose of the stored procedure is to fetch a record if it exists or to create and fetch the record if it does not exist.

The problem is that both threads are checking the existence of the record and reporting false and then both are inserting a new record, creating a duplicate in the database.

I tried sticking the operations inside a transaction but this merely produced hundreds of deadlocks.

Is there any way that I can check for the existence of the record in a thread safe manner so that the second thread won't do its read until the first has finished its insert? I have no control over the threads themselves, only over the stored procs they are executing.

Any help would be appreciated,

Thanks.

It produced deadlocks, what exactly are these transactions doing; they must not be a simple as you are stating. What RDBMS are you using? — BobbyShaftoe, Jun 24 '10 at 05:01
On the contrary, the procedure is very simple - it checks whether the record already exists, creates it if it doesn't and then selects the record. The only reason we have an issue is that there are many concurrent threads (up to 200). — DJCasey, Jun 24 '10 at 05:41
See http://stackoverflow.com/questions/6120819/sql-server-2005-insert-if-not-exists/6121089#6121089 as well. — Jarrod Dixon, Sep 08 '11 at 03:23

Panagiotis Kanavos · Answer 1 · 2010-06-24T10:59:25.663

8

The trick is to add a WHERE to your INSERT statement so that INSERT only works if the item does not exist, followed by the SELECT statement. Supposing that the record can be identified by an ID column you would write:

INSERT INTO MyTable (ID,Col1,Col2,...) 
SELECT @IDValue,@Col1Value,@Col2Value, ...
WHERE NOT EXISTS (SELECT ID  
              FROM MyTable 
              WHERE ID=@IDValue) 

SELECT *  
FROM MyTable 
Where ID=@IDValue

You don't need to put the statements in a transaction because each statement is executed in its own implicit transaction. Thus, there is no way that two INSERTS will succeed at the same time.

EDIT: The INSERT ... SELECT syntax is necessary because TSQL doesn't allow a VALUES and a WHERE part in the INSERT statement.

edited Jun 24 '10 at 10:59

answered Jun 24 '10 at 10:26

Panagiotis Kanavos

120,703
13
188
236

I wonder if you should also add WITH (UPDLOCK, HOLDLOCK) to the inner select statement? – Mladen Mihajlovic Nov 08 '13 at 09:08
1

Whatever for? The operation is atomic. HOLDLOCK will only keep the locks in exactly the same way a transaction would, until either the connection closes or the encompassing transaction (if there was one) completes. In fact, HOLDLOCK is worse than an explicit transaction because you can't control the locks anymore until the connection closes – Panagiotis Kanavos Nov 08 '13 at 10:17

score 0 · Answer 2 · answered Aug 17 '12 at 14:55

0

I assume you are using c# to communicate with the sql server then you can try looking into task parallelism and task library for multithreading the stored procedures.

answered Aug 17 '12 at 14:55

rushabh25

31
3

Its highly unlikely that changing the technique for accessing an external resource will result in solving a concurrency issue. It also may invalid if it were since we're not sure what the client is. – Conrad Frix Sep 28 '12 at 22:01

mmmmmm · Answer 3 · 2010-06-24T08:15:36.873

Trouble is in doing a select and then insert then normally there is a read lock for the select and then a write lock on the insert. Without a transaction then the timing of many updates will often allow multiple inserts to happen as you see. In a transaction the first read lock will stop other processes getting a write lock and if more than one process gets a read lock then none can get a write lock and so you get a deadlock.

In this case I would alter the insert code so that the indices only allow one insert to work ie you have a unique key and only one process will ne able to insert the data so getting no duplicates. The process for updating is then in a transaction either

1) do the insert first and deal with an exception or error if it tries inserting a duplicate

or 2) do a HOLD LOCK (Sybase and SQL Server)on doing the select first - thus the first to lock gets the full permission to insert if needed

or 3) Possibly use the merge command if the RDBMS allows it. This does the check and insert all in one command but will always change the database.

EDIT: I think there is no real alternative to 1 if you need to make sure there is one and only one record inserted as the test for that has to be in a transaction.

the cost could be cut by checking for existence forst in one transaction and then secondly doing the insert and check in another transaction. Thus in most cases you just have a select and in the other cases you get the full slow insert and check but this should occur less often.

I had already tried the first solution but it slowed the process too much - have also tried a hold lock but this did not fix the problem since both threads already have a shared lock on the table and neither can elevate. Not sure about the merge command but if it always modifies the database then it is not what I want. — DJCasey, Jun 24 '10 at 06:26

score 0 · Answer 4 · answered Jun 24 '10 at 10:06

0

Not sure if SQL Server has it. But in MySQL and in oracle you can get write lock while doing a select using for update syntax.

select * 
from table 
for update

Since other threads also need write lock while doing select, they will wait till first thread completes the transaction.

answered Jun 24 '10 at 10:06

Reddy

8,737
11
55
73

This has the same effect as a transaction - it creates locks on the table. This particular syntax, without a where statements, will actually lock the entire table. SQL Server doesn't have this syntax but even with Oracle this syntax results in delays compared to simpler versions. – Panagiotis Kanavos Jun 24 '10 at 10:12

Insert Concurrency Issue - Multithreaded Environment

4 Answers4

Linked