0

Checking my execution plan it shows a warning in a Top N Sort step

Operator used tempdb to spill data during execution with spill level 1 and 4 spilled thread(s), Sort wrote 4609 pages to and read 4609 pages from tempdb with granted memory 107360KB and used memory 107360KB

Is this a bad thing and how should I tackle it?

The original query was

select a.* from TableA a where id in (
    select a.id from TableA a join TableAB ab on a.id = ab.aid join TableB b on ab.bid = b.id
    order by a.datetime desc offset 1000000 rows fetch next 10 rows only
)

In case you wonder, the above was meant to be a faster version of

select a.* from TableA a join TableAB ab on a.id = ab.aid join TableB b on ab.bid = b.id
order by a.datetime desc offset 1000000 rows fetch next 10 rows only

but although it ran faster, it gave that warning message above.

user1589188
  • 5,316
  • 17
  • 67
  • 130
  • @shawnt00 on the contrary, TableA has a clustered index on the datetime column – user1589188 Mar 14 '23 at 04:15
  • 1
    Spill to disk (caching data on disk) isn't great - it takes a lot longer than just sorting in memory, but happens with large sorts and may still be acceptable, depending on your case. If possible, I try to remove them. In your specific situation, note that the two queries are not the same: if a is linked to multiple b records, the top one only shows the a record once, but the bottom multiple times. Also, is table b actually needed? Can you just use ab? Finally, an index (clustered or nonclustered) on ab, columns a_id then b_id, should remove the sort anyway. – seanb Mar 14 '23 at 05:59
  • What is the use case for this anyway? If it is a pagination query then just don't let people navigate to page 100,000 with 10 results per page. Consider using keyset pagination instead. – Martin Smith Mar 14 '23 at 15:01

1 Answers1

0

As a rule, it's preferable using EXISTS over IN clause, this is assuming that you don't want duplicates from TableA as @seanb. a possible rewriting to the query using nested EXISTS would look like

select a.* 
from TableA a 
where exists(
    select 1 
    from TableA a1
    where a1.id = a.Id
        and exists(
            select 1 
            from TableAB ab 
            where a1.id = ab.aid 
                and exists (
                    select 1
                    from TableB b 
                    where ab.bid = b.id
                )
            )
    order by a1.datetime desc offset 1000000 rows fetch next 10 rows only
)

The SORT is inevitable, but it should happen on fewer columns than using the ORDER inside the IN clause. Even if you see the spill warning, you can check the number of pages is reduced, IMO, 4609 pages (~36MB) spill shouldn't make much difference, unless you are running this every few seconds.

Luis LL
  • 2,912
  • 2
  • 19
  • 21
  • There is no such "rule". `EXISTS` and `IN` get the same plan. (unlike `NOT EXISTS` and `NOT IN` which have different semantics and plans) – Martin Smith Mar 14 '23 at 11:48
  • Thanks, but funny though, using `EXISTS` does not return any result with the `OFFEST`. And it is not as fast as the `IN` although it does eliminate the warning as it uses a different plan. – user1589188 Mar 14 '23 at 14:17
  • To be clear the different plan isn't due to `EXISTS` vs `IN`. It will be because the various joins inside the `IN` also got turned to `EXISTS`. This could be written with the same nested `IN` and give the same plan – Martin Smith Mar 14 '23 at 14:57