95

In a test at university there was a question; is it possible to use an aggregate function in the SQL WHERE clause.

I always thought this isn't possible and I also can't find any example how it would be possible. But my answer was marked false and now I want to know in which cases it is possible to use an aggregate function in the WHERE. Also if it isn't possible it would be nice to get a link to the specification where it is described.

Simon Martin
  • 4,203
  • 7
  • 56
  • 93
n3on
  • 2,051
  • 1
  • 16
  • 16
  • WHERE determines the set of row that grouping & aggregation is done on, so how can you use the aggregation to determine the rows? What is your reasoning, with reference to authoritative documentation? Otherwise you're just aking for us to write yet another presentation of the language with no details of what you misunderstand or do or don't understand. How are you stuck finding or understanding any reasonable presentation of grouping/aggregation? Anyway asking for off-site resources is off-topic. – philipxy Jun 08 '22 at 15:19

8 Answers8

136

HAVING is like WHERE with aggregate functions, or you could use a subquery.

select EmployeeId, sum(amount)
from Sales
group by Employee
having sum(amount) > 20000

Or

select EmployeeId, sum(amount)
from Sales
group by Employee
where EmployeeId in (
    select max(EmployeeId) from Employees)
Jason Goemaat
  • 28,692
  • 15
  • 86
  • 113
  • 3
    yes, that it is possible with having i do know. But the WHERE was given. I think it's a definition thing. – n3on Jun 11 '11 at 23:59
  • 2
    Thank you! Just what I needed to know and understand. Up vote. – Lukas Sep 11 '13 at 22:30
  • 1
    thanks, I was going to leave my query in assignment and then i find this one...your answer really help me – A.s. Bhullar Nov 29 '13 at 18:00
  • 1
    Thanks, this saved me a lot of time, all of the other sources out there didn't give the simplicity you did. Perfect answer. – Jeff Apr 26 '15 at 23:57
  • what has to do [select max(EmployeeId) from Employees] that is in the second example with the first example [having sum(amount) > 20000]. – user3944364 Jun 26 '21 at 13:57
36

You haven't mentioned the DBMS. Assuming you are using MS SQL-Server, I've found a T-SQL Error message that is self-explanatory:

"An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference"

http://www.sql-server-performance.com/


And an example that it is possible in a subquery.

Show all customers and smallest order for those who have 5 or more orders (and NULL for others):

SELECT a.lastname
     , a.firstname
     , ( SELECT MIN( o.amount )
         FROM orders o
         WHERE a.customerid = o.customerid
           AND COUNT( a.customerid ) >= 5
        )
        AS smallestOrderAmount
FROM account a
GROUP BY a.customerid
       , a.lastname
       , a.firstname ;

UPDATE.

The above runs in both SQL-Server and MySQL but it doesn't return the result I expected. The next one is more close. I guess it has to do with that the field customerid, GROUPed BY and used in the query-subquery join is in the first case PRIMARY KEY of the outer table and in the second case it's not.

Show all customer ids and number of orders for those who have 5 or more orders (and NULL for others):

SELECT o.customerid
     , ( SELECT COUNT( o.customerid )
         FROM account a
         WHERE a.customerid = o.customerid
           AND COUNT( o.customerid ) >= 5
        )
        AS cnt
FROM orders o
GROUP BY o.customerid ;
ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235
Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
  • 1
    Yes I didn't mention a DBMS because there isn't any specified. It just says SQL -_- – n3on Jun 11 '11 at 23:48
  • 1
    I think this possibility was added in the SQL-92 specs. No idea when various products added the functionality. Testing only with MySQL and SQL-Server shows slighly different behaviour (SQL-Server being more strict and probably more close to the specs). It would be interesting, if anyone else could check for other SQL implementations. – ypercubeᵀᴹ Jun 12 '11 at 10:24
  • While purists will disagree, most places where I have worked say SQL to mean Microsoft SSMS, using TSQL – JosephDoggie Oct 30 '19 at 17:36
  • @JosephDoggie If it says *SQL* , ask about [*SQL*](https://en.wikipedia.org/wiki/SQL) , which is language defined in a standard. – Anton Shepelev Feb 17 '22 at 14:04
15

You can't use an aggregate directly in a WHERE clause; that's what HAVING clauses are for.

You can use a sub-query which contains an aggregate in the WHERE clause.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
  • 1
    I know it is possible as subquery, but i'm not sure if then i can say i can use an aggregat function in the WHERE... i thinks it's a definition thing. – n3on Jun 12 '11 at 00:01
  • 1
    @n3on: I agree...I would argue that it is not possible to use aggregates directly in a WHERE clause - as I said. It is only possible to use them as part of a sub-query - and that would not count as 'in a WHERE clause' in my book. If you give the caveated, nuanced answer, I don't see how they can fault you. If it is a b****y multi-choice question, then you're more nearly stuck. – Jonathan Leffler Jun 12 '11 at 02:17
  • 2
    See Tim's answer. It is possible. – ypercubeᵀᴹ Jun 12 '11 at 10:43
11

UPDATED query:

select id from t where id < (select max(id) from t);

It'll select all but the last row from the table t.

cspolton
  • 4,495
  • 4
  • 26
  • 34
Chandranshu
  • 3,669
  • 3
  • 20
  • 37
  • 2
    I believe this will depend on the DBMS as Tim mentioned. In standard SQL, you'll have to write it as `SELECT id FROM t WHERE id < (SELECT MAX(id) FROM t)` – Coding District Jun 11 '11 at 23:46
  • 1
    Yeah, you're right. I just remembered that it was possible from my early database tutorials where we were required to select top 5 rows without using TOP or LIMIT or ROWNUM. – Chandranshu Jun 11 '11 at 23:49
  • 2
    Yes but then the aggregat function is in the select clause from the subquery and not in the WHERE. And I think a subquery can't be seen as an aggregat function. – n3on Jun 11 '11 at 23:52
10
SELECT COUNT( * )   
FROM agents   
HAVING COUNT(*)>3;  

See more below link:

Samir Lakhani
  • 685
  • 10
  • 19
4

Another solution is to Move the aggregate fuction to Scalar User Defined Function

Create Your Function:

CREATE FUNCTION getTotalSalesByProduct(@ProductName VARCHAR(500))
RETURNS INT
AS
BEGIN

DECLARE @TotalAmount INT

SET @TotalAmount = (select SUM(SaleAmount) FROM Sales where Product=@ProductName)

RETURN @TotalAmount

END

Use Function in Where Clause

SELECT ProductName, SUM(SaleAmount) AS TotalSales
FROM Sales
WHERE dbo.getTotalSalesByProduct(ProductName)  > 1000
GROUP BY Product

References:

1. 2.

Hope helps someone.

Shaiju T
  • 6,201
  • 20
  • 104
  • 196
1

If you are using an aggregate function in a where clause then it means you want to filter data on the basis of that aggregation function. In my case, it's SUM(). I'll jump to the solution.

(select * from(select sum(appqty)summ,oprcod from pckwrk_view group by oprcod)AS asd where summ>500)

  1. The inner query is used to fetch results that need to be filtered.
  2. The aggregate function which has to filter out must be given an ALIAS name because the actual name of the column inside an aggregate function is not accessible or recognized by the outer query.
  3. Finally, the filter can be applied to the aliased name of the column in the inner query
David Lee
  • 665
  • 7
  • 20
RTZ
  • 11
  • 1
0

Try this one

select SUM(RecQty) RecQty,ItemCode from 
CostLedger group by ItemCode
having sum(RecQty) > 2000
Simas Joneliunas
  • 2,890
  • 20
  • 28
  • 35
  • Welcome to SO! Please don't post code-only answers but add a little textual explanation about how and why your approach works and what makes it different from the other answers given. You may also have a look at our ["How to write a good answer"](https://stackoverflow.com/help/how-to-answer) entry. – ahuemmer Jul 29 '22 at 10:01