2

Suppose I have this data:

create table People (ID int identity, Name nvarchar(50), Age int);   
insert into People values 
  ('Bill Jones', 50),
  ('Bill Jones', 12),
  ('Sam Smith', 23),
  ('Jill Brown', 44),
  ('Jill Brown', 67),
  ('Jill Brown', 3)

And this query:

select * from (
   select 
       ID, Name, Age, 
       row_number() over (partition by Name order by ID) [rownum]
   from People
) a where [rownum] = 1

It successfully returns me one person per unique name.

ID  NAME        AGE ROWNUM
1   Bill Jones  50  1
4   Jill Brown  44  1
3   Sam Smith   23  1

However, in order to use row_number(), I must specify an order by, causing the query plan to include an expensive sort operation.

query plan

I don't care about which person is returned; I just need one person per name.

Is there a way to do this without the sorting?

You can see my query and execution plan here: http://sqlfiddle.com/#!3/3ee32/1/0

Blorgbeard
  • 101,031
  • 48
  • 228
  • 272
  • Try using `order by 1/0`, a fake sort. It might game the QO enough to drop the Sort. – RichardTheKiwi May 01 '13 at 05:06
  • 3
    Sorts aren't always expensive and don't trust the %s in the query plans. You can also try the variation `select Id, Name, Age from People where ID in (select min(ID) from people group by Name)` – RichardTheKiwi May 01 '13 at 05:10
  • I think I'm an idiot actually.. it has to sort by Name anyway in order to partition. Changing to `order by 1/0` does stop it ordering by ID *as well as* Name though. – Blorgbeard May 01 '13 at 05:17

4 Answers4

2

I dont know it optmised one or not but it shows record as you want...without order by clause/.....

Select * from People a where id in (Select Top(1) id from people b where name in
                                  (Select name from people group by name) and a.name=b.name)

Sql Fidddle Demo Link

Amit Singh
  • 8,039
  • 20
  • 29
1

How about the query:

select 
    distinct Name, 
    (Select top 1 Age from People b where b.Name=a.Name) Age 
from People a

OR

select b.* From(
    select min(ID) ID from people group by Name
    ) a inner join People b on a.ID=b.ID

with all columns. FACT: None of these queries beat the query with ROW_NUMBER()!

TechDo
  • 18,398
  • 3
  • 51
  • 64
1

You might consider this query:

SELECT 
  a.* 
FROM 
  People a 
LEFT JOIN 
  People b 
ON
  (a.Name = b.Name AND a.id > b.id) 
WHERE b.id IS NULL

When I run it in SQLFiddle, it seems to perform better:

Original query: 0.0146747

Self join: 0.0071784

Eric S
  • 1,363
  • 1
  • 10
  • 20
  • 1
    Where do you see those numbers? I can only see "1ms" for both. Anyway, I translated it to my real table, and it seems to be slower - two table scans instead of one scan and a sort. Interesting technique though! – Blorgbeard May 01 '13 at 05:32
1

I actually see different ways to write the original query. You can also consider using Common Table Expression. Though i believe the optimization level will be almost the same. i did prefer CTE though.

with cte
as
(
    select ID, Name, Age, 
           row_number() over (partition by Name order by ID) [rownum]
    from People
)

select * 
from cte 
where [rownum] = 1
Praveen Nambiar
  • 4,852
  • 1
  • 22
  • 31