8

I am trying to use CASE and GROUP BY together to conditionally filter results only if they match the CASE criteria, and if they don't, return results as if there were no GROUP BY criteria specified.

Here's a simple model of what I have:

es.id | es.acct_num | p.id | p.name
1001  | 4306-0      | 1569 | The High School
1002  | 4306-0      | 1569 | The High School
665   | 5906-7      | 981  | Rec Center
783   | 5906-7      | 1221 | The Gym

and here's what I would like to see:

es.id | es.acct_num | p.id | p.name
1001  | 4306-0      | 1569 | The High School
1002  | 4306-0      | 1569 | The High School
0     | 5906-7      | 0    | MULTI-SITE

Since es.acct_num 5906-7 has more than 1 distinct associated p.id, I would like it grouped by the es.acct_num as 1 line item, then have the es.id, p.id represented by a '0', and have the p.name represented by the string 'MULTI-SITE'.

However, since es.acct_num 4306-0 only has exactly 1 distinct associated p.id, I would like these all returned as individual line items as if no grouping condition was applied.

The solution escapes me. How can I accomplish this?

Linger
  • 14,942
  • 23
  • 52
  • 79
Hamking
  • 155
  • 1
  • 2
  • 8

3 Answers3

9

You can do this with a conditional aggregations:

SELECT (case when count(*) > 1 then 0 else max(es.id) end) as esid,
       es.acct_num,
       (case when count(*) > 1 then 0 else max(p.id) end) as pid,
       (case when count(*) > 1 then 'MULTI-SITE' else p.name end) as name,
       COUNT(*) AS cnt
FROM es join
     p
     on . . . 
GROUP BY es.acct_num;

Don't be fooled by the max() expressions. There is only one row for that part of the case, so these simply return the value in that row.

EDIT:

I think you need to do this with a correlated subquery, join/aggregation, or by counting the values using variables. This would be much easier in almost any other database, using window functions. Here is the idea, which is a bit hard to express in detail without seeing the full query:

SELECT (case when esp.cnt > 1 then 0 else max(es.id) end) as esid,
       es.acct_num,
       (case when esp.cnt > 1 then 0 else max(p.id) end) as pid,
       (case when esp.cnt > 1 then 'MULTI-SITE' else p.name end) as name
FROM es join
     p
     on . . .  join
     (select es.acct_num, count(distinct p.id) as cnt
      from . . .
      group by es.acct_num
     ) esp
     on esp.acct_num = es.acct_num
GROUP BY es.acct_num, (case when cnt = 0 then es.id end);
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Unfortunately this still isn't quite right. I'm getting only 2 rows returned (one for each unique acct_num) and they both state "MULTI-SITE". I should get back 3 rows as in the original diagram: 2 should return actual es.id rows and 1 grouped "MULTI-SITE" row. – Hamking Jul 18 '14 at 18:47
  • @Hamking . . . I had the logic backwards. I think the edited version does what you want. – Gordon Linoff Jul 18 '14 at 18:56
  • +1. The second query, with the `JOIN` to the inline view to get count, looks like the right approach; but to get the two rows collapsed into a single MULTI-SITE row, we'd really need a `GROUP BY` clause on the outer query. We need to know what makes each row "unique"... if (es.id,es.acct_num,p.id) tuple is unique, then we could `GROUP BY (esid,acct_num,pid)` (referencing the expressions in the SELECT list). If that's not unique enough, we'd need another expression in the `GROUP BY`, following the same pattern used for the `acct_num` and `pid` expressions. (If MAX() is needed, wrap whole CASE. – spencer7593 Jul 18 '14 at 19:50
  • @spencer7593 . . . You are so right. There is now a `group by` with a conditional aggregation. I find this problem hard to think about because it is backwards. Usually, I would want to collapse the rows where things are the same, rather than keep those. – Gordon Linoff Jul 18 '14 at 19:52
3

You could divide this into two separate queries and join everything with UNION:

SELECT es.id, es.acct_num, p.id, p.name, COUNT(p.id) AS count
FROM ... es JOIN ... p
GROUP BY es.acct_num
HAVING count > 1

UNION

SELECT 0 AS es.id, es.acct_num, 0 AS p.id, 'MULTI-SITE' AS p.name, COUNT(p.id) AS count
FROM ... es JOIN ... p
GROUP BY es.acct_num
HAVING count = 0
freedomn-m
  • 27,664
  • 8
  • 35
  • 57
Crozin
  • 43,890
  • 13
  • 88
  • 135
  • I like the idea of a UNION, but this doesn't quite work for me. It's still grouping by acct_num in both select statements. If I'm going to use a UNION, I would need 1 result set to return everything that wouldn't fall into the "MULTI-SITE" statement (i.e. all energy sources iterated and NOT grouped), and the other should return ONLY the stuff that meets the "MULTI-SITE" conditions. – Hamking Jul 18 '14 at 18:03
3

Sorry for the delay with this one.

The solution I settled on was seemingly a hybrid of the 2 ideas. A UNION between 2 queries that had nested within them a sub-query that first acted as a filter to exclude the "MULTI-SITE" results, and in the second case included ONLY these results:

SELECT col1,col2,etc... from table WHERE acct_num NOT IN (

SELECT MULTISITE.es_acct_num FROM (
SELECT acct_num as es_acct_num,
       COUNT(DISTINCT p.id) as `cnt`
FROM blah, blah, blah
GROUP BY es.acct_num HAVING `cnt` > 1) as MULTISITE)

UNION

SELECT col1,col2,etc... from table WHERE acct_num IN (

SELECT MULTISITE.es_acct_num FROM (
SELECT acct_num as es_acct_num,
       COUNT(DISTINCT p.id) as `cnt`
FROM blah, blah, blah
GROUP BY es.acct_num HAVING `cnt` > 1) as MULTISITE)
GROUP BY es.acct_num -- notice this final group by clause sitting outside the "filter" rolls the multi-site results up into their respective groupings by acct_num

Both responses included great ideas but the solution I came up with was a combination of the 2 ideas, so I felt it best to mark the solution I'm currently using as the correct answer. However, I have still +1'd both responses above as I found them both very useful.

Hamking
  • 155
  • 1
  • 2
  • 8