8

The other day, I gave an answer to this question but then other user solved that problem with sum + case conditional statement to add one edge condition in result. So, the question came to my mind, how statement sum(case when jobname = 'Analyst' then 1 else 0 end) in the below query works

select d.*
from (select deptno,
         sum(case when jobname = 'Analyst' then 1 else 0 end) as numAnalysts
      from employees
      group by deptno
      order by numAnalysts asc
     ) d
where rownum = 1;`

and return the number of employees over a department. Also, I would like to understand the performance of this query.

Before posting this question, I read this, this and this but still didn't get how this works.

Community
  • 1
  • 1
codiacTushki
  • 750
  • 1
  • 9
  • 22
  • the first step to understanding performance of a query, is to use explain on the query. https://dev.mysql.com/doc/refman/5.0/en/using-explain.html. as for how the case conditional works, it is simply an if / else style construct. when you sum a field, you total all the values of the field. in the above, instead of summing a field, you are essentially summing the result of a function that operates on the field, and returns either 1 or 0. – pala_ Mar 30 '15 at 00:02
  • Where does rownum come from? – Strawberry Mar 30 '15 at 00:12
  • rownum is the part of outer query and that is used to return the row with minimum no of analysts in department. That query is the answer for [this](http://stackoverflow.com/questions/29316840/get-the-minimum-employees-with-a-given-job/29316973#29316973) – codiacTushki Mar 30 '15 at 00:16
  • rownum is oracle. not sure why this is tagged mysql tbh, but its basically equivalent to 'limit 1' in this instance. in which case my comment above about mysql's 'explain' is pretty useless. – pala_ Mar 30 '15 at 00:18
  • The referenced question was tagged Oracle and this one probably should be as well. – Gordon Linoff Mar 30 '15 at 00:20

2 Answers2

14

Presumably, this is the part that you are struggling to understand:

  select deptno,
         sum(case when jobname = 'Analyst' then 1 else 0 end) as numAnalysts
  from employees
  group by deptno

This is a simple aggregation query, really. What the query is doing is:

  • Look at each row in employees
  • If jobname is 'Analyst' then assign the value of 1 (this is the case statement. Otherwise, assign a value of0`.
  • Aggregate by department, summing the value just calculated. This has the effect of counting the number of analysts.

case is an expression that returns a value. The sum() is simply adding up that value for each group.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
6

Lets try to split the problem in 2 parts.

First, let suppose you want a field saying if the jobname is 'Analyst' or not.

SELECT
  deptno,
  CASE WHEN jobname = 'Analyst' THEN 1 ELSE 0 END AS IsAnalyst
FROM
  employees

This query is going to return 0 for all the jobname that are no 'Analyst' and 1 for all the ones that are 'Analyst'

At this point you will have something like this

deptno IsAnalyst
1      1
1      0
1      0
2      0
2      1
2      1
2      1
2      0

Second, You want to summarize this information by department

SELECT
  deptno,
  SUM(CASE WHEN jobname = 'Analyst' THEN 1 ELSE 0 END) AS numAnalysts
FROM
  employees
GROUP BY
  deptno

You are applying a sum to all those value and grouping by deptno.

At this point (I removed the order by from the query to simplify it) you will have the following output

deptno numAnalysts
1      1
2      3

I think that an example is worth a thousand words

Hope this helps

CrApHeR
  • 2,595
  • 4
  • 25
  • 40