-1

Spent a few hours trying to debug the query and understand why it keeps scanning the whole table after the result has been fetched.

The query calculates the running total and should stop execution as soon as the limit is reached or exceeded. [TotalSize] >= @maxBlobSize

WITH TopRows AS ( 
                    SELECT 
                        [Id], 
                        [BinaryData], 
                        [CreatedAt], 
                        Sum([BinaryDataSize]) OVER (ORDER BY [CreatedAt]) as TotalSize 
                        FROM [dbo].[MessageLogs]) 
                    SELECT * 
                    FROM TopRows 
                    WHERE [TotalSize] < @maxBlobSize -- 10.000.000
                    ORDER BY [CreatedAt];
Evgeny Belov
  • 308
  • 4
  • 12

1 Answers1

2

The query calculates the running total and should stop execution as soon as the limit is reached or exceeded.

You don't understand how SQL works. A SQL SELECT statement describes the result set. It does not specify how the query is going to be run.

In your case, you have an operation that requires scanning the entire table: the window function. The SQL engine is going to calculate the cumulative sum over the entire table before the query finishes.

In our case, you are using SUM() and the engine doesn't know that BinaryDataSize is never negative. Even if it did, I don't think any SQL engine would be smart enough to short circuit evaluation of the window function.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786