1

I have a T1 table. For every value in column C1/C2, column C3 starts at 1.

CREATE TABLE t1 
(
    C1 int, 
    c2 int, 
    c3 int, 

    CONSTRAINT [pk_idx] PRIMARY KEY CLUSTERED ([c1] ASC, [c2] ASC, [c3] ASC) 
)
SELECT C1, C2, MAX(C3) AS MAX_C3 
FROM T1 
GROUP BY C1, C2
C1  C2  MAX_C3
--------------
1   1   46
1   2   540
1   3   6
1   4   2
1   5   579
2   1   46
2   2   540
2   3   6
2   4   2
2   5   579

This query retrieves all counts of C1 = 1. (number of rows read 1173).

I wonder if there is a direction where only 5 can be read.

SELECT 
    c1, c2, c3
FROM 
    (SELECT 
         c1, c2, c3,
         ROW_NUMBER() OVER (PARTITION BY [c2] ORDER BY [C3] DESC) AS TopRow
     FROM t1
     WHERE c1 = 1) R
WHERE 
    R.TopRow = 1
c1  c2  c3
-------------
1   1   46
1   2   540
1   3   6
1   4   2
1   5   579

I've been thinking about the MAX function, GROUP BY clause, or a subquery, but I can't find an answer.

Thom A
  • 88,727
  • 11
  • 45
  • 75
P.Lonnie
  • 105
  • 2
  • 11
  • 1
    yes this is possible using a recursive CTE - e.g. as here https://sqlperformance.com/2014/10/t-sql-queries/performance-tuning-whole-plan (similar to "Finding the Distinct Values") - the overhead of the seeks may well be greater than the scan unless you have many rows with the same `c1, c2` though – Martin Smith Aug 02 '22 at 10:40
  • @MartinSmith Thank You so much :) I will check the URL you shared. – P.Lonnie Aug 02 '22 at 10:52

1 Answers1

1

yes this is possible using a recursive CTE - e.g. as here https://sqlperformance.com/2014/10/t-sql-queries/performance-tuning-whole-plan

With the CREATE TABLE in the question and the following example data

INSERT t1
SELECT TOP 100000 
    ROW_NUMBER() OVER (ORDER BY @@SPID) % 5,
    ROW_NUMBER() OVER (ORDER BY @@SPID) % 10,
    ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM sys.all_columns c1, sys.all_columns c2

the following query returns 10 rows as the result and reads 10 rows from t1

WITH RecursiveCTE
     AS (SELECT TOP (1) C1, C2, c3
         FROM   t1
         ORDER  BY [c1] DESC, [c2] DESC, [c3] DESC
         UNION ALL
         SELECT C1, C2, c3
         FROM   (SELECT t1.*,
                        rn = ROW_NUMBER() OVER ( ORDER BY t1.[c1] DESC, t1.[c2] DESC, t1.[c3] DESC)
                 FROM   t1
                        JOIN RecursiveCTE R
                          ON t1.[c1] < R.[c1]
                              OR ( t1.[c1] = R.[c1]
                                   AND t1.[c2] < R.[c2] )) AS R
         WHERE  R.rn = 1)
SELECT *
FROM   RecursiveCTE 

Unless you have a lot of rows per c1, c2 combination you may well find that just scanning all of them performs better though.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • Thanks for making the query too. As you said, Clustered Idex scan seems efficient if the number of data of C1 = 1 is small. But the more data in C1 = 1 , the better the query you gave. After confirming the data distribution of C1 with the development team, we will improve the query. – P.Lonnie Aug 03 '22 at 01:22