Partition By Scan To Seek

Question

I have a T1 table. For every value in column C1/C2, column C3 starts at 1.

CREATE TABLE t1 
(
    C1 int, 
    c2 int, 
    c3 int, 

    CONSTRAINT [pk_idx] PRIMARY KEY CLUSTERED ([c1] ASC, [c2] ASC, [c3] ASC) 
)

SELECT C1, C2, MAX(C3) AS MAX_C3 
FROM T1 
GROUP BY C1, C2

C1  C2  MAX_C3
--------------
1   1   46
1   2   540
1   3   6
1   4   2
1   5   579
2   1   46
2   2   540
2   3   6
2   4   2
2   5   579

This query retrieves all counts of C1 = 1. (number of rows read 1173).

I wonder if there is a direction where only 5 can be read.

SELECT 
    c1, c2, c3
FROM 
    (SELECT 
         c1, c2, c3,
         ROW_NUMBER() OVER (PARTITION BY [c2] ORDER BY [C3] DESC) AS TopRow
     FROM t1
     WHERE c1 = 1) R
WHERE 
    R.TopRow = 1

c1  c2  c3
-------------
1   1   46
1   2   540
1   3   6
1   4   2
1   5   579

I've been thinking about the MAX function, GROUP BY clause, or a subquery, but I can't find an answer.

yes this is possible using a recursive CTE - e.g. as here https://sqlperformance.com/2014/10/t-sql-queries/performance-tuning-whole-plan (similar to "Finding the Distinct Values") - the overhead of the seeks may well be greater than the scan unless you have many rows with the same `c1, c2` though — Martin Smith, Aug 02 '22 at 10:40
@MartinSmith Thank You so much :) I will check the URL you shared. — P.Lonnie, Aug 02 '22 at 10:52

score 1 · Accepted Answer · answered Aug 02 '22 at 22:36

yes this is possible using a recursive CTE - e.g. as here https://sqlperformance.com/2014/10/t-sql-queries/performance-tuning-whole-plan

With the CREATE TABLE in the question and the following example data

INSERT t1
SELECT TOP 100000 
    ROW_NUMBER() OVER (ORDER BY @@SPID) % 5,
    ROW_NUMBER() OVER (ORDER BY @@SPID) % 10,
    ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM sys.all_columns c1, sys.all_columns c2

the following query returns 10 rows as the result and reads 10 rows from t1

WITH RecursiveCTE
     AS (SELECT TOP (1) C1, C2, c3
         FROM   t1
         ORDER  BY [c1] DESC, [c2] DESC, [c3] DESC
         UNION ALL
         SELECT C1, C2, c3
         FROM   (SELECT t1.*,
                        rn = ROW_NUMBER() OVER ( ORDER BY t1.[c1] DESC, t1.[c2] DESC, t1.[c3] DESC)
                 FROM   t1
                        JOIN RecursiveCTE R
                          ON t1.[c1] < R.[c1]
                              OR ( t1.[c1] = R.[c1]
                                   AND t1.[c2] < R.[c2] )) AS R
         WHERE  R.rn = 1)
SELECT *
FROM   RecursiveCTE

Unless you have a lot of rows per c1, c2 combination you may well find that just scanning all of them performs better though.

Thanks for making the query too. As you said, Clustered Idex scan seems efficient if the number of data of C1 = 1 is small. But the more data in C1 = 1 , the better the query you gave. After confirming the data distribution of C1 with the development team, we will improve the query. — P.Lonnie, Aug 03 '22 at 01:22

Partition By Scan To Seek

1 Answers1