How to not COUNT if a bunch of column meet some condition while GROUP BY twice?

Question

I have a table sample like this

SAMPLE's table

+------------+------------------------+--------+------+
| id_laporan | id_laporan_rekomendasi | status | id   |
+------------+------------------------+--------+------+
|          3 |                      2 |      2 | 01   |
|          3 |                      2 |      2 | 01   |
|          3 |                      2 |      2 | 01   |
|          3 |                      2 |      3 | 01   |
|          8 |                      3 |      2 | 01   |
|          8 |                      3 |      2 | 01   |
|          8 |                      4 |      2 | 01   |
|          7 |                      1 |      2 | 02   |
|          7 |                      1 |      2 | 02   |
|          7 |                      1 |      2 | 02   |
|          7 |                      1 |      3 | 02   |
|          7 |                      5 |      2 | 02   ||
|          7 |                      5 |      3 | 02   |
+------------+------------------------+--------+------+

i want to GROUP BY column id and COUNT/SUM how many id when some column meet some condition. For making may question more well-known, First i will make the id into two section

Id: 01

+------------+------------------------+--------+------+
| id_laporan | id_laporan_rekomendasi | status | id   |
+------------+------------------------+--------+------+
|          3 |                      2 |      2 | 01   |
|          3 |                      2 |      2 | 01   |
|          3 |                      2 |      2 | 01   |
|          3 |                      2 |      3 | 01   |
|          8 |                      3 |      2 | 01   |
|          8 |                      3 |      2 | 01   |
|          8 |                      4 |      2 | 01   |

Id: 02

|          7 |                      1 |      2 | 02   |
|          7 |                      1 |      2 | 02   |
|          7 |                      1 |      2 | 02   |
|          7 |                      1 |      3 | 02   |
|          7 |                      5 |      2 | 02   ||
|          7 |                      5 |      3 | 02   |
+------------+------------------------+--------+------+

Take a look at the id : 01 section first. You could see for column id_laporan in id : 01 section, it has id 3 and 8, and beside id_laporan, there is column id_laporan_rekomendasi. For id_laporan : 3 it has id_laporan_rekomendasi: 2 and for id_laporan : 8 it has id_laporan_rekomendasi: 3 and 4.

NOW, each row in column id_laporan_rekomendasi, have status ( a column beside of id_laporan_rekomendasi ). In short, what i explain before is just like these:

Id_laporan:3

+------------+------------------------+--------+
| id_laporan | id_laporan_rekomendasi | status |
+------------+------------------------+--------+
|          3 |                      2 |      2 |
|          3 |                      2 |      2 |
|          3 |                      2 |      2 |
|          3 |                      2 |      3 |

id_laporan:8

+------------+------------------------+--------+
| id_laporan | id_laporan_rekomendasi | status |
+------------+------------------------+--------+
|          8 |                      3 |      2 |
|          8 |                      3 |      2 | 
|          8 |                      4 |      2 |

Each id_laporan_rekomendasi have status. For each different id in id_laporan_rekomendasi and on the right column, column status, have at least one data 3, not 2 or other numbers, it will count as 1 else, it will count as 0

so for Id_laporan:3 it count as 1, for Id_laporan:8 it count as 0 because even **Id_laporan:8** has two different id for column id_laporan_rekomendasi but column status have no data 3 then it count as 0. For example, the table looks like this

SAMPLE

id_laporan:8

+------------+------------------------+--------+
| id_laporan | id_laporan_rekomendasi | status |
+------------+------------------------+--------+
|          8 |                      3 |      2 |
|          8 |                      3 |      3 | 
|          8 |                      4 |      3 |

Then it will count as 2 because for each id_laporan_rekomendasi which is different id, it will count as 1. We could assume the table meet the condition ( as i explained before, above this sample ).that means for **Id_laporan:3** and **Id_laporan:8** according to my explanation, they have data 1 and 0. after i count these, i have to SUM both of those (that 1 and 0) data for column id:01. These method was same with id:02. so the expected output will looks like below.

Other scenario

So what happen if the status 3 have already in the middle of id_laporan_rekomendasi ? , the id will definitely skipped status = 3 at the middle of id_laporan_rekomendasi. Actually, there is no a scenario where you can found two data 3 in the same id_laporan_rekomendasi . it will looks like this

+------------+------------------------+--------+
| id_laporan | id_laporan_rekomendasi | status |
+------------+------------------------+--------+
|          3 |                      2 |      2 | // 01 : 0, because status = 2
|          3 |                      2 |      3 | // 01 : 1, calculation function works because status = 3
|          3 |                      2 |      2 | // 01 : 1, because status = 2
|          3 |                      2 |      2 | // 01 : 1, because status = 2

IMPORTANT NOTE : Each different id_laporan_rekomendasi have only one data status = 3, so there is no scenario where status = 3 were found twice in the same id_laporan_rekomendasi like below scenario

False scenario

+------------+------------------------+--------+
| id_laporan | id_laporan_rekomendasi | status |
+------------+------------------------+--------+
|          3 |                      2 |      2 | 
|          3 |                      2 |      3 | // Data status = 3, okay, for id_laporan_rekomendasi = 2, i dont need to check for the rest of id_laporan_rekomendasi's status, good grief, i will skip to the next id_laporan_rekomendasi` maybe, i will check id_laporan_rekomendasi = 3 
|          3 |                      2 |      3 | // what is this... ther is no way...
|          3 |                      2 |      2 |

This is the output that i expect for my case

+------+--------------+
| id   | count        |
+------+--------------+
| 01   |      1       |     
| 02   |      2.      |
+------+--------------+

Why the result i expect is just like above tables, based from the first table, the calculation is should be like this:

+------------+------------------------+--------+------+
| id_laporan | id_laporan_rekomendasi | status | id   |
+------------+------------------------+--------+------+
|          3 |                      2 |      2 | 01   | // 01 : 0, because status 2
|          3 |                      2 |      2 | 01   | // 01 : 0, because status 2
|          3 |                      2 |      2 | 01   | // 01 : 0, because status 2
|          3 |                      2 |      2 | 01   | // 01 : 0, because status 2
|          3 |                      2 |      2 | 01   | // 01 : 0, because status 2
|          3 |                      2 |      3 | 01   | // 01 : 1, at this point the calculation works because status 3
|          8 |                      3 |      2 | 01   | // 01 : 1, because status 2
|          8 |                      3 |      2 | 01   | // 01 : 1, because status 2 
|          8 |                      4 |      2 | 01   | // 01 : 1, because status 2 
|          7 |                      1 |      2 | 02   | // 02 : 0, because status 2 
|          7 |                      1 |      2 | 02   | // 02 : 0, because status 2 
|          7 |                      1 |      2 | 02   | // 02 : 0, because status 2 
|          7 |                      1 |      3 | 02   | // 02 : 1, at this point the calculation works because status 3
|          7 |                      5 |      2 | 02   | // 02 : 1, because status 2  
|          7 |                      5 |      3 | 02   | // 02 : 2, at this point the calculation works because status 3
+------------+------------------------+--------+------+

I have tried like this query

SELECT id, count(id) from sample group by `id`

But the result will like below

+------+--------------+
| id   | count        |
+------+--------------+
| 01   |      7       |     
| 02   |      6       |
+------+--------------+

I know that i have to use CASE function inside count but i dont know how to do it with my complex table.

You are on the right track with `case` inside `count`. Show the query you have tried. Maybe read this to better undestand that concept: https://modern-sql.com/feature/filter (in "Conforming Alternatives" you will see the `case` trick) — Markus Winand, Mar 04 '18 at 11:24
"id:02 has no data 3 at least one for their status that's why it didn't have data." - But I see two rows with `id=02` and `status=3`. — Paul Spiegel, Mar 04 '18 at 11:46
Do you have a table which holds all unique IDs (01, 02, ..)? — Paul Spiegel, Mar 04 '18 at 13:25
From your example it's not clear if you want to count distinct values in `id_laporan_rekomendasi` per `id_laporan` or together. — Paul Spiegel, Mar 04 '18 at 13:57
@PaulSpiegel i want to count `id_laporan_rekomendasi` per `id_laporan` after thatn i want to sum how many data that have been calculate before with `group by` column `id`, so i need to know how many `id_laporan_rekomendasi` per `id_laporan` per `id` for each column `id` that have status = 3, which column `status` as child column of column `id_laporan_rekomendasi` and column `id_laporan_rekomendasi` as column child of `id_laporan` and as column child of `id` ( sorry about my bad grammar, i dont how to explain that very well ) — Gagantous, Mar 04 '18 at 15:01
@PaulSpiegel okay ihave edited again my question, please take a look — Gagantous, Mar 04 '18 at 15:22
It really looks like all you need is to count the rows with `status=3` per `id`. If this is the case - you couldn't explain it in a more complicated way ;-) — Paul Spiegel, Mar 04 '18 at 15:38
@PaulSpiegel actually, i working this using `php` but i dont know how to solved this using `php`, so i think ineed to solved this using query... :') — Gagantous, Mar 04 '18 at 15:49
@PaulSpiegel my question is similar to this, https://stackoverflow.com/questions/11145511/use-of-group-by-twice-in-mysql, but instead i need to sum just like my expected result looks like — Gagantous, Mar 05 '18 at 01:07
@GordonLinoff my question is similar to this, https://stackoverflow.com/questions/11145511/use-of-group-by-twice-in-mysql, but instead i need to sum just like my expected result looks like — Gagantous, Mar 05 '18 at 01:08

Gordon Linoff · Answer 1 · 2018-03-04T15:30:45.973

2

You want a conditional count(distinct). I think this is what you want:

select id, count(distinct case when status = 3 then id_laporan end)
from sample
group by id;

This counts the number of different id_laporan with a status of "3" for each id.

EDIT:

SQL tables represent unordered sets. Your results depend on ordering, but you do not have a column that specifies the ordering. You need one. If necessary, you could add one to the table as:

alter table add t_id int auto_increment;

But the resulting ordering might not be what you really want.

With such a column you can readily calculate cumulative number of "3"s for each id. Here is a method using correlated subqueries:

select t.*,
       (select count(*)
        from t t2
        where t2.id = t.id and
              t2.? <= t.? and status = 3
       ) 
from t;

The ? represents the column for the ordering. If you have a set of columns with the ordering, then that can be used as well.

edited Mar 04 '18 at 15:30

answered Mar 04 '18 at 11:37

Gordon Linoff

1,242,037
58
646
786

if i change `count` with `sum` is it okay ? – Gagantous Mar 04 '18 at 13:25
it just return `1` for id `02`, it has to be `2` because there is `2` data `3` in column `status` – Gagantous Mar 04 '18 at 14:45
what's the meaning by `ordering` that `?` part ? i am not quite get it, could you explain that more ? – Gagantous Mar 04 '18 at 15:44
could you give me example with `?` part ? – Gagantous Mar 04 '18 at 15:52
@Gagantous . . . The `?` is a column that represents the ordering of the rows in the table, as explained in the answer. – Gordon Linoff Mar 05 '18 at 02:37

Gagantous · Accepted Answer · 2018-03-05T02:33:31.273

I got the answer, thanks to Gordon Linoff and Paul Spiegel for the DISTINCT and CASE in COUNT() function, i really appreciate (y)

I know that there is still a bug on this code, but i least it help to improvise this answer myself

SELECT t1.id , sum(tot) from ( 
  SELECT id, id_laporan, id_laporan_rekomendasi, 
  COUNT(distinct case when status = 3 then 1 end) as tot
  FROM sample t1
GROUP by id_laporan_rekomendasi ) t1
GROUP BY t1.id

Result

| id | sum(tot) |
|----|----------|
| 01 |        1 |
| 02 |        2 |

Here is the SQLFiddle Demo

How to not COUNT if a bunch of column meet some condition while GROUP BY twice?

2 Answers2