5

I'm migrating a queue in disk to in memory SQL Server 2016 to implement a queue.

This is my queue format:

CREATE TABLE dbo.SimpleQueue
(
   MsgId BIGINT NOT NULL PRIMARY KEY NONCLUSTERED IDENTITY(1, 1),
   Payload VARCHAR(7500) NOT NULL,
   IsDeleted BIT NOT NULL
) WITH (MEMORY_OPTIMIZED=ON)
GO

This is my Enqueue native SQL Server stored procedure:

CREATE PROCEDURE dbo.Enqueue(@Payload VARCHAR(7500), @IsDeleted BIT)
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS BEGIN ATOMIC WITH
(TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = 'english')

  INSERT INTO dbo.SimpleQueue (Payload, IsDeleted) VALUES (@Payload, @IsDeleted); 

END
GO

I'm trying to write down the Dequeue native SQL Server stored procedure, but I'm having some difficulties on how to implement an UPDATE using results of a SELECT or a variable table.

So far I tried:

CREATE PROCEDURE dbo.Dequeue(@BatchSize INT = 1)
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS BEGIN ATOMIC WITH
( TRANSACTION ISOLATION LEVEL = SNAPSHOT,LANGUAGE = 'english' )
   UPDATE dbo.SimpleQueue
        SET IsDeleted=1
        WHERE MsgId = (
            SELECT TOP(@BatchSize) MsgId, Payload
                FROM dbo.SimpleQueue
                WHERE IsDeleted = 0)
END
GO

But I get this error:

Subqueries (queries nested inside another query) is only supported in SELECT statements with natively compiled modules.

So I tried a different approach by using a variable to store the result.

First I created a Table type:

CREATE TYPE dbo.SimpleDequeue
  AS TABLE 
   (
    MsgId BIGINT NOT NULL PRIMARY KEY NONCLUSTERED, 
    Payload INT NOT NULL
   )
   WITH (MEMORY_OPTIMIZED=ON)
GO

So far so good, then I tried to use it:

CREATE PROCEDURE dbo.Dequeue(@BatchSize INT = 1)
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS BEGIN ATOMIC WITH
( TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = 'english')

    DECLARE @result dbo.SimpleDequeue;

    INSERT @result 
        SELECT TOP(@BatchSize) MsgId, Payload FROM dbo.SimpleQueue
        WHERE IsDeleted = 0

    UPDATE dbo.SimpleQueue 
        SET IsDeleted = 1 
        WHERE 
            @result.MsgId = dbo.SimpleQueue.MsgId

    SELECT MsgId, Payload FROM @result
END
GO

I get this error:

Must declare the scalar variable "@result".

(only when is using @result on WHERE @result.MsgId = dbo.SimpleQueue.MsgId)

Here is the old dequeue process using in disk SQL Server tables:

CREATE PROCEDURE dbo.DequeueInDisk
    @BatchSize INT = 1
AS
BEGIN
    SET NOCOUNT ON;
    WITH 
    cte AS (
        SELECT TOP(@BatchSize) Payload
        FROM dbo.SimpleQueue WITH (ROWLOCK, READPAST)
        ORDER BY MsgId
    )
    DELETE FROM cte OUTPUT deleted.Payload;
END

How can I make that UPDATE and OUTPUT the updated values (with high performance, since this is critical)?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
João Antunes
  • 798
  • 8
  • 27

2 Answers2

2

I think your approach makes perfect sense from SQL development point of view - you have to think in sets rather than in row-by-row approach. But it looks like Microsoft thinks that you require different approach for native compiled procedures, more imperative and really row-by-row (see Implementing UPDATE with FROM or Subqueries or Implementing MERGE Functionality in a Natively Compiled Stored Procedure. So your procedure can look like this:

create or alter procedure [dbo].[Dequeue](@BatchSize int = 1)
with native_compilation, schemabinding, execute as owner
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = 'english')
    declare
        @result dbo.SimpleDequeue;

    declare
        @MsgId int,
        @Payload varchar(7500),
        @i int = 0;

    while @i < @BatchSize
    begin
        select top (1)
            @MsgId = s.MsgId,
            @Payload = s.Payload
        from dbo.SimpleQueue as s
        where
            s.IsDeleted = 0
        order by
            s.MsgId;

        if @@rowcount = 0
        begin
            set @i = @BatchSize;
        end
        else
        begin
            update dbo.SimpleQueue set IsDeleted = 1 where MsgId = @MsgId;

            insert into @result (MsgId, Payload)
            select @MsgId, @Payload;

            set @i += 1;
        end;
    end;

    select MsgId, Payload from @result;
END

I've not tested how fast it will work, but I'll definitely will test it with some real numbers, cause we have a couple of these table-queues implemented and I wonder if we can get some performance boost with Hekaton.

Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
-2

In your old routine you use the TOP(@BatchSize) with an ORDER BY MsgId. The new approach seems not to have this ORDER BY... You'll get random result...

Your

WHERE MsgId = (
            SELECT TOP(@BatchSize) MsgId, Payload
                FROM dbo.SimpleQueue
                WHERE IsDeleted = 0
                /*added this!*/ ORDER BY MsgId  )

will come back with two columns and - probably - several rows. You cannot compare this with an "=".

What you can try:

WHERE MsgId IN (
            SELECT TOP(@BatchSize) MsgId
                FROM dbo.SimpleQueue
                WHERE IsDeleted = 0
                ORDER BY MsgId)

Or you could try to use an INNER JOIN, something like this:

 UPDATE dbo.SimpleQueue
        SET IsDeleted=1
   FROM dbo.SimpleQeueu
   INNER JOIN dbo.SimpleQueue AS sq ON dbo.SimpleQeueu.MsgId=sq.MsgId
                                       AND sq.IsDeleted=0
                                       --this is missing the TOP-clause

What else: You could try an INNER JOIN (SELECT TOP ... ) AS InnerSimpleQueue ON .. or maybe a CROSS APPLY.

EDIT: One more approach with a CTE:

WITH myCTE AS
(
    SELECT TOP(@BatchSize) MsgId
    FROM dbo.SimpleQueue
    WHERE IsDeleted = 0
    ORDER BY MsgId
)   
UPDATE dbo.SimpleQueue
        SET IsDeleted=1
FROM dbo.SimpleQeueu
INNER JOIN myCTE ON myCTE.MsgId=dbo.SimpleQueue.MsgId
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • Regarding ORDER BY I omitted that clause on purpose. Accordingly to MSDN "Rows can be retrieved according to the order specified with index creation. If the sort order of the index matches the sort order required for a particular query, for example if the index key matches the ORDER BY clause, there is no need to sort the rows as part of query execution." Source: https://msdn.microsoft.com/en-us/library/dn133166.aspx – João Antunes Sep 10 '15 at 14:07
  • I've tried to use UPDATE dbo.SimpleQueue SET IsDeleted = 1 WHERE dbo.SimpleQueue.MsgId IN (@result) but I still get this error: Must declare the scalar variable "@result". Please do not forget that you can't use subqueries in **native stored procedures** – João Antunes Sep 10 '15 at 14:12
  • @JoãoAntunes, well, I have no experience with in-memory tables (must have a look at!), but this sounds more like an explanation how the optimizer works,so there is no need for a double ORDER BY, if the sorting fits already. In general there is absolutely NO guarantee in which order you will get your result - unless you specify an ORDER BY in the outermost query (same with subselects...) – Shnugo Sep 10 '15 at 14:13
  • @JoãoAntunes,regrettfully you cannot use IN-clause with a paramter... But - as least in my cases - you can use IN-clause with a sub-select. Your "WITH NATIVE_COMPLIATION" should make this possible... – Shnugo Sep 10 '15 at 14:15
  • I'll add ORDER BY just for precaution :) since it will be ignored. Regarding the IN clause using SELECTS I can't use it because WHERE will have a subquery and will give this error: "Subqueries (queries nested inside another query) is only supported in SELECT statements with natively compiled modules." – João Antunes Sep 10 '15 at 14:21
  • Did you try the INNER JOIN? You could join to your @result, as it behaves like a table. The INNER JOIN will ensure that only fitting rows are processed... – Shnugo Sep 10 '15 at 14:24
  • I tried, but it gives the same error, because I cant use @result :( **Must declare the scalar variable "@result"**. I think the solution is to create 2 Native storedprocedures and try to apply a IN to the result of stored procedure, but I'm not sure how this is done and if it is possible. Going to study that approach. – João Antunes Sep 10 '15 at 14:35
  • @JoãoAntunes, Just edited my answer. Look right at the bottom. Maybe you can trick this out with a CTE... – Shnugo Sep 10 '15 at 14:52
  • @JoãoAntunes, There's no IN with the result of a SP. But you could try a User Defined Function or a VIEW... – Shnugo Sep 10 '15 at 14:54
  • In memory SQL doesn't allow CTE :( instead of that I'm using Types as Tables – João Antunes Sep 10 '15 at 15:00
  • On more idea: You could pass in your filter table as a parameter. so it will be declared... – Shnugo Sep 10 '15 at 15:02
  • 1
    @JoãoAntunes, and one more idea: It could be enough to add your results-variable to the parameters list. Your "must be declared" should be solved this way... – Shnugo Sep 10 '15 at 17:28
  • Thanks for your effort, but I can only add a parameter as read only :/ on native stored procedures. I think I have a possible solution for my problem, that is to make just a SELECT and UPDATE with the same clause. Since this is working on the same ATOMIC operation I'm working on my "own version" of the table, and other transactions can't change my values between SELECT and UPDATE, so the result will always be the same. After I confirm that the behavior is correct I'll update the response – João Antunes Sep 10 '15 at 17:49
  • Good luck! I'd probably try to go with a table valued function, getting the MsgID and returning the result you can then try to INNER JOIN. But - as told before -'m not familiar with this "in-memory" concept ... – Shnugo Sep 10 '15 at 18:54