7

There is lots of information in the internet regarding this common "problem".

Solutions like:

IF NOT EXISTS() BEGIN INSERT INTO (...) END

are not thread-safe in my opinion and you will probably agree.

However could you confirm that putting the exist into the where clause of one single select would solve the problem of the highest concurrency in sql engine? Is it enough?

insert into Table (columns)
select column1, column2, column3
where not exists (select top 1 1 from Table where something)

Should be there also added some higher transaction level or can this be executed on a default one: committed?

Would this work under uncommitted level?

Thanks!

//Added later

Can i assume that both sql' are correct:

1) set transaction isolation level repeatable read

   IF NOT EXISTS() BEGIN INSERT INTO (...) END

2) set transaction isolation level repeatable read

insert into Table (columns)
select column1, column2, column3
where not exists (select top 1 1 from Table where something)
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Paul
  • 219
  • 3
  • 10
  • 1
    No it won't work at either readcommitted or uncommitted level. You would need some additional locking hints. possible duplicate of [Only inserting a row if it's not already there](http://stackoverflow.com/questions/3407857/only-inserting-a-row-if-its-not-already-there) – Martin Smith May 25 '11 at 07:24
  • 1
    You don't need to "help out" exists clauses - they're smart enough to finish after they've seen 1 row. you can just do `EXISTS (SELECT * FROM...` and it does the right thing. – Damien_The_Unbeliever May 25 '11 at 07:26
  • it will do the right thing, but in case of the cocurrency it is not thread-safe enough and primary violation errors may occur under high load. So according to the link of @Martin, repeatable read isolation transaction should be added. – Paul May 25 '11 at 07:35
  • @Paul - I meant you don't need to do `select top 1...` in an exists clause - I wasn't commenting on the rest of the code. – Damien_The_Unbeliever May 25 '11 at 07:37
  • @Damien_The_Unbeliever Aaa ok sorry for misunderstanding. I have always thought that writing: "top 1 1" doesn't make the server to fetch values of the columns.. – Paul May 25 '11 at 07:41
  • @Paul - back in, I think, SQL Server 6.5 or so, it used to matter what columns you have in the select list for an `EXISTS` predicate. The optimizer has gotten a lot smarter since then, and specifically optimizes for `SELECT *` (It knows that the rows/columns aren't actually going to be accessed) – Damien_The_Unbeliever May 25 '11 at 07:49
  • @Damien_The_Unbeliever, @Paul: Just try `NOT EXISTS (SELECT 1/0 FROM...)` and also see page 191 on ANSI 92, case 3a http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt – gbn May 25 '11 at 07:53
  • @gbn - yes, if you have an expression of some kind, it has to evaluate it (since the expression might return null). So any non-* expression is potentially more expensive. – Damien_The_Unbeliever May 25 '11 at 09:46
  • @Damien - You are presumably thinking of `COUNT(*)`. `NULL` makes no difference to `EXISTS`. You can use the expression `SELECT NULL` also. gbn's divide by zero example does not raise any errors. Though `SELECT COUNT(1/0)` doesn't raise any errors either actually. – Martin Smith May 25 '11 at 09:53
  • @Martin, @gbn - it's a little long for a comment, and won't format right, but this prints "okay then": `create table T ( ID int not null, Val1 as 10/ID ) go insert into T(ID) select 0 union all select 1 union all select 2 go if exists(select * from T) begin print 'Okay then' end else begin print 'oh dear' end` – Damien_The_Unbeliever May 25 '11 at 09:57
  • I could swear I had the `NOT EXISTS (select 1/0` producing an error a short time ago - I'll try to reproduce again – Damien_The_Unbeliever May 25 '11 at 10:00
  • `SELECT COUNT(1/0) FROM master..spt_values` is a bit strange actually. With `SET ANSI_WARNINGS OFF; SET ARITHABORT OFF;` it correctly returns `0` as the expression evaluates to `NULL`, Otherwise it returns the number of rows so it is kind of evaluated. – Martin Smith May 25 '11 at 10:05
  • @Martin: COUNT(1/0) with a GROUP BY will raise an error though. Without, it doesn't – gbn May 25 '11 at 10:06
  • @gbn - my slight mistake - I can get the `1/0` to cause on error on SQL Server 2000, but not on 2008. I'd though you were raising a point about other expressions being evaluated - serves me right for not checking which server I was using. – Damien_The_Unbeliever May 25 '11 at 10:22
  • @Damien - Backing up your original point about using `TOP 1` in `NOT EXISTS` Conor Cunningham specifically warns that this can be harmful [here](http://blogs.msdn.com/b/conor_cunningham_msft/archive/2010/05/14/conor-vs-anti-semi-join-reordering.aspx) as it limits the transformations considered. – Martin Smith May 26 '11 at 18:19

2 Answers2

6

With TRY/CATCH you can avoid the extra read

BEGIN TRY
   INSERT etc
END TRY
BEGIN CATCH
    IF ERROR_NUMBER() <> 2627
      RAISERROR etc
END CATCH
  • A NOT EXISTS will read the table, whether in the IF or WHERE
  • The INSERT requires a read to check uniqueness

If you can discard duplicates, this is a highly scalable technique

Links:

Community
  • 1
  • 1
gbn
  • 422,506
  • 82
  • 585
  • 676
  • Well ok, but if i Should insert or update then: Could I do: if error_number() = 2627 begin update ... end? – Paul May 25 '11 at 07:52
  • @Paul: see my last link then for dealing with UPDATEs – gbn May 25 '11 at 07:53
  • Is it possible to raise default error in begin catch block? or Do I have to explixity get the error_number and the message to satisfy raiserror parameters? – Paul May 25 '11 at 09:28
  • @Paul: You'd have to read ERROR_NUMBER() and ERROR_MESSAGE() etc and build your own message. [Example](http://stackoverflow.com/questions/1882788/sql-server-rethrow-exception-with-the-original-exception-number/1883148#1883148). However, the number then becomes 50000. The ability to rethrow the same error is added next version: [THROW](http://msdn.microsoft.com/en-us/library/ee677615%28SQL.110%29.aspx) – gbn May 25 '11 at 09:39
1

To answer the updated question repeatable read would still not be sufficient.

It is holdlock / serializable level that you would need.

You are trying to prevent phantoms (where on the first read no rows met the criteria so the NOT EXISTS returns true but subsequently a concurrent transaction inserts a row meeting it)

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • It seems that I will start being paranoic when writing sql statements :) – Paul May 25 '11 at 08:09
  • @Paul - `serializable` would also give you the risk of deadlock, This could be avoided with `UPDLOCK, HOLDLOCK`. I would use `TRY...CATCH` myself though. – Martin Smith May 25 '11 at 08:18