Can I group by in SQL query with window function?

Question

I need to get employees with smallest salary in their departments I did it using anti join.

     select emp.employee_id,emp.last_name,emp.salary,emp.department_id
     from employees emp
     left join employees sml 
     on sml.department_id = emp.department_id and sml.salary < emp.salary
     where sml.employee_id is null and emp.department_id is not null

But I've been told that it's possible to do it using window function using one select. However I can't group it by department_id and use it at the same time. Is that a bug or me being stupid?

     SELECT  department_id,
     min(salary) OVER (partition by department_id)  as minsalary
     FROM employees;
     GROUP BY department_id

SQL Developer says 00979. 00000 - "not a GROUP BY expression"

AcePL · Answer 1 · 2022-03-02T08:00:38.043

First thing to remember is that windowed functions (like OVER() clause) work on the result of the query. That is: Server first executes the query and only then applies the windowed function as defined by you (of course, it's an oversimplification of what actually happens, but good enough to illustrate my point).

This means that you can actually use windowed function and group by clause in the same query, but you need to encapsulate group by aggregate with windowed function aggregate, like this:

SELECT department_id,
       min(min(salary)) OVER (partition by department_id) as minsalary
FROM employees
GROUP BY department_id;

However, I agree that this is not a good place to use windowed function. Matt's proposition - which I upvoted, full disclosure - is best here (ROW_NUMBER() in CTE or subquery, then selecting only the desired rows in main SELECT).

Is the colon placed in the correct place? – alexis Apr 23 '21 at 17:52 — alexis, Apr 23 '21 at 17:52
@alexis No, of course. Interesting how that happened. – AcePL Apr 23 '21 at 17:59 — AcePL, Apr 23 '21 at 17:59

score 10 · Accepted Answer · answered Nov 14 '16 at 17:50

If you run your second query without the group by - which you may have already tried, from the extra semicolon in what you posted - you'll see that you get one row for every employee, each showing the minimum salary in their department. That minimum is the analytic min() because it has a window clause. The PARTITION BY is the equivalent of a GROUP BY, but without the aggregation over the whole result set.

The simplest way to get the same result (almost) is to use the RANK() analytic function instead, which ranks the values based on the partition and order you supply, while allowing for ties:

SELECT employee_id, last_name, salary, department_id,
  RANK() OVER (PARTITION BY department_id ORDER BY salary) AS rnk
FROM employees
ORDER BY department_id, rnk;

EMPLOYEE_ID LAST_NAME                     SALARY DEPARTMENT_ID        RNK
----------- ------------------------- ---------- ------------- ----------
        200 Whalen                          4400            10          1
        202 Fay                             6000            20          1
        201 Hartstein                      13000            20          2
        119 Colmenares                      2500            30          1
        118 Himuro                          2600            30          2
        117 Tobias                          2800            30          3
        116 Baida                           2900            30          4
        115 Khoo                            3100            30          5
        114 Raphaely                       11000            30          6
...
        102 De Haan                        17000            90          1
        101 Kochhar                        17000            90          1
        100 King                           24000            90          3
...

For departments 20 and 30 you can see the row ranked 1 is the lowest salary. For department 90 there are two employees ranked 1, because they have the same lowest salary.

You can use that as an inline view and select just those rows ranked number 1:

SELECT employee_id, last_name, salary, department_id
FROM (
  SELECT employee_id, last_name, salary, department_id,
    RANK() OVER (PARTITION BY department_id ORDER BY salary) AS rnk
  FROM employees
)
WHERE rnk = 1
ORDER BY department_id;

EMPLOYEE_ID LAST_NAME                     SALARY DEPARTMENT_ID
----------- ------------------------- ---------- -------------
        200 Whalen                          4400            10
        202 Fay                             6000            20
        119 Colmenares                      2500            30
        203 Mavris                          6500            40
        132 Olson                           2100            50
        107 Lorentz                         4200            60
        204 Baer                           10000            70
        173 Kumar                           6100            80
        101 Kochhar                        17000            90
        102 De Haan                        17000            90
        113 Popp                            6900           100
        206 Gietz                           8300           110
        178 Grant                           7000              

13 rows selected.

If you didn't have to worry about ties there is an even simpler alternative, but it ins't appropriate here.

Notice that this gives you one more row than your original query. You are joining on sml.department_id = emp.department_id. If the department ID is null, as it is for employee 178, that join fails because you can't compare null to null with equality tests. Because this solution doesn't have a join, that doesn't apply, and you see that employee in the results.

it works fine, thank you. But tell me pease, is there an easier way? Seems a bit long to me. I thought that there might be some other way to group it, much more pleasant and obvious. — E.Saraf, Nov 14 '16 at 18:21
@E.Saraf - [FIRST](https://docs.oracle.com/cd/E11882_01/server.112/e41084/functions065.htm#SQLRF00641) is an alternative, but you only get one row per department - so you'd only get 101 or 102, not both, for department 90. — Alex Poole, Nov 14 '16 at 20:21

Matt · Answer 3 · 2016-11-14T18:13:03.423

5

WITH cte AS (
    SELECT
       emp.*
       ,ROW_NUMBER() OVER (PARTITION BY emp.department_id ORDER BY emp.salary) as RowNumber
    FROM
       employees emp
)

SELECT c.*
FROM
    cte c
WHERE
    c.RowNumber = 1

You can use ROW_NUMBER() to get 1 row of lowest salary by department as above. If you want all rows in the case of ties switch it to RANK()

Otherwise you can do it with MIN() OVER but this will give you ties

WITH cte AS (
    SELECT
       emp.*
       ,MIN(emp.salary) OVER (PARTITION BY emp.department_id) as DeptMinSalary
    FROM
       employees emp
)

SELECT c.*
FROM
    cte c
WHERE
    c.salary = c.DeptMinSalary

As a derived table instead of a Common Table Expression:

SELECT t.*
FROM
    (SELECT
       emp.*
       ,ROW_NUMBER() OVER (PARTITION BY emp.department_id ORDER BY emp.salary) as RowNumber
    FROM
       employees emp) t
WHERE
    t.RowNumber = 1

One last thought on the subject because you ask "Can I group by in a SQL query with a window function?" Alex covers that the PARTITION BY is like a sub grouping within the Window Function. But to use a GROUP BY grouping with a Window function means that the GROUP BY result set would be evaluated PRIOR to the Window Function being evaluated.

edited Nov 14 '16 at 18:13

answered Nov 14 '16 at 17:47

Matt

13,833
2
16
28

Dorry, i didn't sepcified that I've been told that it's possible to do with one SELECT statements. Is it so? – E.Saraf Nov 14 '16 at 17:57
this is still considered 1 select as it is a common table expression. Basically you can move the cte into a derived table if you want. But the answer is NO it cannot be with only 1 select with a window function because the where statement is evaluated before a window function. But the 2 selects is still considered 1 query – Matt Nov 14 '16 at 17:59
`select * from employees` is fine; but `select *, from employees` [isn't allowed in Oracle](http://stackoverflow.com/q/1751856/266304) - unless you prefix the `*`, with the table name/alias. So your latest edit is fine (already +1 though!) – Alex Poole Nov 14 '16 at 18:10
@AlexPoole oh okay good to know, definitely different for Orcale than SQL-server and some of the others on that one then. Thanks! +1 to you too for your explanation and detailing! – Matt Nov 14 '16 at 18:14

score 2 · Answer 4 · answered Nov 14 '16 at 17:37

2

you do not need window function in this case, cause a simple group by would work too.

And the error is correct, cause the window function isn't an aggregat function. And a window function can't be a Group by- member.

But you could use "distinct" instead.

SELECT DISTINCT department_id,
     min(salary) OVER (partition by department_id)  as minsalary
FROM employees;

In your Special case all this is oversized, of course. But I think understanding is the name of the game.

answered Nov 14 '16 at 17:37

am2

380
5
21

this would only give the smallest salary per department not the actual employee data – Matt Nov 14 '16 at 17:46
@Matt right. I need smallest salary by departments, with employees name. – E.Saraf Nov 14 '16 at 17:53
1

...And because the windowed function work the way they do, this is one of the very few (if not the only) situations where distinct and OVER() clauses will work properly. In most cases you'd not get the results you think you will get. – AcePL Jan 14 '20 at 15:09

score 0 · Answer 5 · edited Jan 28 '23 at 01:56

0

SELECT t.employee_id, t.department_id, t.last_name, t.salary 
FROM (SELECT employee_id, department_id, last_name, salary, 
             MIN(salary) OVER(PARTITION BY department_id) AS dept_min_salary 
      FROM employees) t
WHERE t.salary = t.dept_min_salary;

edited Jan 28 '23 at 01:56

lemon

14,875
6
18
38

answered Jan 26 '23 at 09:01

Manoj Ponnachana

1
2

Please include a brief explanation of [how and why this solves the problem](https://meta.stackoverflow.com/q/392712/13138364). This will help future readers to better understand your solution. - [From Review](https://stackoverflow.com/review/late-answers/33711830) – tdy Feb 01 '23 at 22:54

score -1 · Answer 6 · answered Oct 22 '22 at 00:10

Since window function is not a aggregate function, you need to placed non aggregated labels like as department_id, salary under group by. Looking at your question, not suggested to use window function.

SELECT  department_id,
     min(salary) OVER (partition by department_id)  as minsalary
     FROM employees;
     GROUP BY department_id, salary;

Can I group by in SQL query with window function?

6 Answers6

Linked