Select one row per group without specifying order?

Question

Suppose I have this data:

create table People (ID int identity, Name nvarchar(50), Age int);   
insert into People values 
  ('Bill Jones', 50),
  ('Bill Jones', 12),
  ('Sam Smith', 23),
  ('Jill Brown', 44),
  ('Jill Brown', 67),
  ('Jill Brown', 3)

And this query:

select * from (
   select 
       ID, Name, Age, 
       row_number() over (partition by Name order by ID) [rownum]
   from People
) a where [rownum] = 1

It successfully returns me one person per unique name.

ID  NAME        AGE ROWNUM
1   Bill Jones  50  1
4   Jill Brown  44  1
3   Sam Smith   23  1

However, in order to use row_number(), I must specify an order by, causing the query plan to include an expensive sort operation.

query plan

I don't care about which person is returned; I just need one person per name.

Is there a way to do this without the sorting?

You can see my query and execution plan here: http://sqlfiddle.com/#!3/3ee32/1/0

Try using `order by 1/0`, a fake sort. It might game the QO enough to drop the Sort. — RichardTheKiwi, May 01 '13 at 05:06
Sorts aren't always expensive and don't trust the %s in the query plans. You can also try the variation `select Id, Name, Age from People where ID in (select min(ID) from people group by Name)` — RichardTheKiwi, May 01 '13 at 05:10
I think I'm an idiot actually.. it has to sort by Name anyway in order to partition. Changing to `order by 1/0` does stop it ordering by ID *as well as* Name though. — Blorgbeard, May 01 '13 at 05:17

Amit Singh · Answer 1 · 2013-05-01T05:47:43.623

2

I dont know it optmised one or not but it shows record as you want...without order by clause/.....

Select * from People a where id in (Select Top(1) id from people b where name in
                                  (Select name from people group by name) and a.name=b.name)

Sql Fidddle Demo Link

edited May 01 '13 at 05:47

answered May 01 '13 at 05:09

Amit Singh

8,039
20
29

TechDo · Answer 2 · 2013-05-01T07:21:13.907

1

How about the query:

select 
    distinct Name, 
    (Select top 1 Age from People b where b.Name=a.Name) Age 
from People a

OR

select b.* From(
    select min(ID) ID from people group by Name
    ) a inner join People b on a.ID=b.ID

with all columns. FACT: None of these queries beat the query with ROW_NUMBER()!

edited May 01 '13 at 07:21

answered May 01 '13 at 05:03

TechDo

18,398
3
51
64

Interesting, but I actually have several columns other than Age, so I'd need a subquery for each. – Blorgbeard May 01 '13 at 05:20

score 1 · Answer 3 · answered May 01 '13 at 05:21

1

You might consider this query:

SELECT 
  a.* 
FROM 
  People a 
LEFT JOIN 
  People b 
ON
  (a.Name = b.Name AND a.id > b.id) 
WHERE b.id IS NULL

When I run it in SQLFiddle, it seems to perform better:

Original query: 0.0146747

Self join: 0.0071784

answered May 01 '13 at 05:21

Eric S

1,363
1
10
20

1

Where do you see those numbers? I can only see "1ms" for both. Anyway, I translated it to my real table, and it seems to be slower - two table scans instead of one scan and a sort. Interesting technique though! – Blorgbeard May 01 '13 at 05:32

score 1 · Answer 4 · answered May 01 '13 at 05:57

I actually see different ways to write the original query. You can also consider using Common Table Expression. Though i believe the optimization level will be almost the same. i did prefer CTE though.

with cte
as
(
    select ID, Name, Age, 
           row_number() over (partition by Name order by ID) [rownum]
    from People
)

select * 
from cte 
where [rownum] = 1

Select one row per group without specifying order?

4 Answers4