
I am trying to implement an event source in SQL Server, and have been experiencing deadlocking issues.

In my design, events are grouped by DatasetId and each event is written with a SequenceId, with the requirement that, for a given DatasetId, SequenceIds are serial: starting at 1 and increasing by one with each event, never skipping a value and never repeating one either.

My Events table looks something like:

CREATE TABLE [Events]
(
    [Id] [BIGINT] IDENTITY(1,1) NOT NULL,
    [DatasetId] [BIGINT] NOT NULL,
    [SequenceId] [BIGINT] NOT NULL,
    [Value] [NVARCHAR](MAX) NOT NULL
)

I also have a non-unique, non-clustered index on the DatasetId column.
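
For reference, that index looks something like this (the index name here is just illustrative):

CREATE NONCLUSTERED INDEX [IX_Events_DatasetId]
ON [Events] ([DatasetId]);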

In order to insert into this table with the above restrictions on SequenceId, I have been inserting rows inside a transaction at the SERIALIZABLE isolation level, calculating the required SequenceId manually within that transaction as the max of the existing SequenceIds for the given DatasetId plus one:

DECLARE @DatasetId BIGINT = 1, @Value NVARCHAR(MAX) = N'I am an event.';

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION

BEGIN TRY

    DECLARE @SequenceId BIGINT;
    SELECT @SequenceId = ISNULL(MAX([SequenceId]), 0) + 1
    FROM [Events]
    WHERE [DatasetId] = @DatasetId;

    INSERT INTO [Events] ([DatasetId], [SequenceId], [Value])
    VALUES (@DatasetId, @SequenceId, @Value);

    COMMIT TRANSACTION;

END TRY

BEGIN CATCH

    -- Roll back if a transaction is still open, then re-throw so the caller
    -- sees the error (e.g. deadlock victim error 1205) instead of it being swallowed.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    THROW;

END CATCH

This has worked fine in terms of the guarantees that I require on the SequenceId column. However, I have been experiencing deadlocking when trying to insert multiple rows in parallel, even when such rows are for different DatasetIds.

The behaviour seems to be that the query to generate a SequenceId in the first connection blocks the same query in the second connection, and that second attempt in turn blocks the first connection's ability to insert its row, so neither transaction can complete, hence the deadlock.

Is there a means of avoiding this whilst still gaining the benefits of a Serializable transaction isolation level?

Another technique I have been considering is reducing the isolation level and instead using sp_getapplock to manually acquire a lock on the table for a given DatasetId, meaning I can then ensure I can generate a consistent SequenceId. Once the transaction has been committed/rolled back, the lock would automatically be released. Is this approach reasonable, or is manually managing locks like this considered an anti-pattern?
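
To make that concrete, the sp_getapplock variant I have in mind looks roughly like the following (an untested sketch; the resource-name format and the lock timeout are just my own choices):

DECLARE @DatasetId BIGINT = 1, @Value NVARCHAR(MAX) = N'I am an event.';

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;

BEGIN TRY

    -- Serialise writers per dataset. With @LockOwner = 'Transaction' the lock
    -- is released automatically when the transaction commits or rolls back.
    DECLARE @Resource NVARCHAR(255) = N'Events_Dataset_' + CAST(@DatasetId AS NVARCHAR(20));
    DECLARE @LockResult INT;

    EXEC @LockResult = sp_getapplock
        @Resource = @Resource,
        @LockMode = 'Exclusive',
        @LockOwner = 'Transaction',
        @LockTimeout = 10000;

    -- sp_getapplock returns a negative value if the lock was not granted.
    IF @LockResult < 0
        THROW 50000, N'Could not acquire application lock for the dataset.', 1;

    DECLARE @SequenceId BIGINT;
    SELECT @SequenceId = ISNULL(MAX([SequenceId]), 0) + 1
    FROM [Events]
    WHERE [DatasetId] = @DatasetId;

    INSERT INTO [Events] ([DatasetId], [SequenceId], [Value])
    VALUES (@DatasetId, @SequenceId, @Value);

    COMMIT TRANSACTION;

END TRY

BEGIN CATCH

    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    THROW;

END CATCH

As I understand it, this trades the serializable range locks for a single exclusive application lock per DatasetId, so inserts for different DatasetIds should no longer contend with each other.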

  • You might want to read [The Serializable Isolation Level](https://sqlperformance.com/2014/04/t-sql-queries/the-serializable-isolation-level) - because it doesn't stop concurrent queries from operating concurrently, it only specifies what the results must be. Therefore it doesn't promise to avoid deadlocks. – Dale K Sep 10 '19 at 09:07
  • Given you are using `MAX([SequenceId])` you really want to be the exclusive user of the table while you are getting the value and while you are setting the value. Therefore taking an exclusive table lock for the transaction is an option, as is using `sp_getapplock` – Dale K Sep 10 '19 at 09:10
  • Although last time I tried using `sp_getapplock` in such a situation, it gave me deadlocks. So I went back to an exclusive table lock. – Dale K Sep 10 '19 at 09:11
  • Seems you missed the `DatasetId = @DatasetId` condition in `SELECT @SequenceId = ISNULL(MAX([SequenceId]), 0) + 1 FROM [Events]`; also I think read committed is good enough for your scenario. You just need to add the hint HOLDLOCK in the SELECT for @SequenceId – PeterHe Sep 10 '19 at 13:41
  • Also you need to change the isolation level before starting the trx. – PeterHe Sep 10 '19 at 13:42
  • Thanks @PeterHe, I've fixed that in the example code. The example above was not the actual code, however, and these issues do not exist in the actual code, so they are not responsible for the deadlocking. – Alex Dawes Sep 11 '19 at 11:01
  • @DaleBurrell I had a mistake in my original question which has now been fixed: because the query to calculate the `SequenceId` filters by `DatasetId`, the lock should in theory (from my understanding) only be taken on that single `DatasetId`, owing to the presence of the non-clustered index on this column. – Alex Dawes Sep 11 '19 at 11:03
  • @DaleBurrell I should also make clear that I understand that Serializable transactions do not guarantee to avoid deadlocking, but I don't understand where the deadlock is coming from, given the structure of the query and the expected behaviour of the serializable transaction locking. I am performing a read (using the aggregate `MAX` function) and then a write on a single table, and so I'm not sure how two queries can deadlock here. If the first query obtains a lock it should stop the second query from doing so, and so the first query should be able to complete without issue. – Alex Dawes Sep 11 '19 at 11:05
  • @DaleBurrell I am assuming there is something going on here with lock level escalation, but my SQL-fu is not strong enough to understand the details involved. – Alex Dawes Sep 11 '19 at 11:06
  • @PeterHe I will try the suggestion using READ COMMITTED and the HOLDLOCK hint and see if that works for me (I've sketched my reading of it below). Thanks! – Alex Dawes Sep 11 '19 at 11:07
  • Pull the deadlock graph and inspect exactly what order and what level the locks being taken are (you'll need to google accessing the deadlock graph). And if you need help deciphering it, I suggest posting over at https://dba.stackexchange.com/ as they are the experts. – Dale K Sep 11 '19 at 12:09
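
For reference, my reading of the READ COMMITTED + HOLDLOCK suggestion is roughly the following (an untested sketch; the UPDLOCK hint is my own addition, on the understanding that taking the range lock in update mode stops two connections from both holding shared locks and then deadlocking on the subsequent INSERT):

DECLARE @DatasetId BIGINT = 1, @Value NVARCHAR(MAX) = N'I am an event.';

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;

BEGIN TRY

    DECLARE @SequenceId BIGINT;

    -- HOLDLOCK as suggested above; UPDLOCK is an extra assumption on my part.
    SELECT @SequenceId = ISNULL(MAX([SequenceId]), 0) + 1
    FROM [Events] WITH (UPDLOCK, HOLDLOCK)
    WHERE [DatasetId] = @DatasetId;

    INSERT INTO [Events] ([DatasetId], [SequenceId], [Value])
    VALUES (@DatasetId, @SequenceId, @Value);

    COMMIT TRANSACTION;

END TRY

BEGIN CATCH

    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    THROW;

END CATCH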

0 Answers