61

In SQL Server 2008 I have a table CUSTOMERS that has two columns as:

ID, BALANCE

How can I write the query that selects the ID of the customer who has maximum balance, "in the most effective way"?

Option 1: ORDER BY BALANCE and SELECT TOP(1) --> costs too much.

Option 2: Firstly Get MAX amount, then make another query that uses the amount in where clause --> costs too much and not seem reliable.

Erik Philips
  • 53,428
  • 11
  • 128
  • 150
Ahmet Altun
  • 3,910
  • 9
  • 39
  • 64
  • 5
    If you add a non clustered index to Balance you should find that Option 1 is much quicker. – StuartLC Dec 05 '11 at 15:24
  • Do you want to know all the IDs in the event of a tie? It seems you are looking for a random one in the event of a tie as is. – Gary Jun 01 '17 at 21:02

5 Answers5

115

Note: An incorrect revision of this answer was edited out. Please review all answers.

A subselect in the WHERE clause to retrieve the greatest BALANCE aggregated over all rows. If multiple ID values share that balance value, all would be returned.

SELECT 
  ID,
  BALANCE
FROM CUSTOMERS
WHERE BALANCE = (SELECT MAX(BALANCE) FROM CUSTOMERS)
Michael Berkowski
  • 267,341
  • 46
  • 444
  • 390
  • 1
    Im no SQL expert but doesnt Group By have its own cost?? – Baz1nga Dec 05 '11 at 15:36
  • @Baz1nga - You cannot work on aggregate functions with additional columns without having a group by clause. Everything has a cost but at what is the real question? – JonH Dec 05 '11 at 16:33
  • 4
    Column BALANCE is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause. – Serge Profafilecebook Dec 04 '14 at 11:25
  • 5
    @SergeProfafilecebook Strange - I wonder why this answer was accepted so many years ago. There should be a subquery in the `HAVING` or `WHERE`. – Michael Berkowski Dec 04 '14 at 11:49
  • What version of SQL Server is that original answer valid for? It doesn't even make any sense, much less actually run. (If you're aggregating by id, what is the balance being shown?) – MgSam Oct 24 '17 at 17:17
  • @MgSam No idea. I vaguely remember this answer reappearing 3 years ago. I can't delete it because it's accepted, though shouldn't have been. I'm going to edit out the top section entirely. Downvote freely. – Michael Berkowski Oct 24 '17 at 17:31
39

Here's an option if you have multiple records for each Customer and are looking for the latest balance for each (say they are dated records):

SELECT ID, BALANCE FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY DateModified DESC) as RowNum, ID, BALANCE
    FROM CUSTOMERS
) C
WHERE RowNum = 1
Gary
  • 3,254
  • 2
  • 27
  • 30
  • 4
    This is nice syntax, but I noticed the performance of `SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY DateModified DESC)` is really impact full when selecting a larger set of data – Joas Nov 28 '19 at 09:13
8

What do you mean costs too much? Too much what?

SELECT MAX(Balance) AS MaxBalance, CustomerID FROM CUSTOMERS GROUP BY CustomerID

If your table is properly indexed (Balance) and there has got to be an index on the PK than I am not sure what you mean about costs too much or seems unreliable? There is nothing unreliable about an aggregate that you are using and telling it to do. In this case, MAX() does exactly what you tell it to do - there's nothing magical about it.

Take a look at MAX() and if you want to filter it use the HAVING clause.

HK boy
  • 1,398
  • 11
  • 17
  • 25
JonH
  • 32,732
  • 12
  • 87
  • 145
3

Say, for an user, there is revision for each date. The following will pick up record for the max revision of each date for each employee.

select job, adate, rev, usr, typ 
from tbl
where exists (  select 1 from ( select usr, adate, max(rev) as max_rev 
                                from tbl
                                group by usr, adate 
                              ) as cond
                where tbl.usr=cond.usr 
                and tbl.adate =cond.adate 
                and tbl.rev =cond.max_rev
             )
order by adate, job, usr
Glorfindel
  • 21,988
  • 13
  • 81
  • 109
-2

The query answered by sandip giri was the correct answer, here a similar example getting the maximum id (PresupuestoEtapaActividadHistoricoId), after calculate the maximum value(Base)

select * 
from (
    select PEAA.PresupuestoEtapaActividadId,
        PEAH.PresupuestoEtapaActividadHistoricoId,             
        sum(PEAA.ValorTotalDesperdicioBase) as Base,
        sum(PEAA.ValorTotalDesperdicioEjecucion) as Ejecucion
    from hgc.PresupuestoActividadAnalisis as PEAA
    inner join hgc.PresupuestoEtapaActividad as PEA
        on PEAA.PresupuestoEtapaActividadId = PEA.PresupuestoEtapaActividadId
    inner join hgc.PresupuestoEtapaActividadHistorico as PEAH
        on PEA.PresupuestoEtapaActividadId = PEAH.PresupuestoEtapaActividadId                                                         
    group by PEAH.PresupuestoEtapaActividadHistoricoId, PEAA.PresupuestoEtapaActividadId    
) as t
where exists (
    select 1 
    from (
        select MAX(PEAH.PresupuestoEtapaActividadHistoricoId) as PresupuestoEtapaActividadHistoricoId                                                                     
        from hgc.PresupuestoEtapaActividadHistorico as PEAH                       
        group by PEAH.PresupuestoEtapaActividadId  
    ) as ti
    where t.PresupuestoEtapaActividadHistoricoId = ti.PresupuestoEtapaActividadHistoricoId 
)
Massimiliano Kraus
  • 3,638
  • 5
  • 27
  • 47