Use of GROUP BY twice in MySQL

Question

My table looks like this.

Location    Head    Id  IntTime
1           AMD     1   1
2           INTC    3   3
3           AMD     2   2
4           INTC    4   4
5           AMD2    1   0
6           ARMH    5   1
7           ARMH    5   0
8           ARMH    6   1
9           AAPL    7   0
10          AAPL    7   1

Location is the primary key. I need to GROUP BY Head and by Id and when I use GROUP BY, I need to keep the row with the smallest IntTime.

After the first GROUP BY Id, I should get (I keep the smallest IntTime)

Location    Head    Id  IntTime
2           INTC    3   3
3           AMD     2   2
4           INTC    4   4
5           AMD2    1   0
7           ARMH    5   0
8           ARMH    6   1
9           AAPL    7   0

After the second GROUP BY Head, I should get (I keep the smallest IntTime)

Location    Head    Id  IntTime
2           INTC    3   3
3           AMD     2   2
5           AMD2    1   0
7           ARMH    5   0
9           AAPL    7   0

When I run the following command, I keep the smallest IntTime but the rows are not conserved.

SELECT Location, Head, Id, MIN(IntTime) FROM test 
GROUP BY Id

Also, to run the second GROUP BY, I save this table and do again

SELECT Location, Head, Id, MIN(IntTime) FROM test2 
GROUP BY Head

Is there a way to combine both commands?

[Edit: clarification] The result should not contain two Head with the same value or two Id with the same value. When deleting those duplicates, the row with the smallest IntTime should be kept.

[MIN()](http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html#function_min) does not *order* the results, but simply pulls the *minimum value* for the column in question (in this case `IntTime`). All other columns are returned as if `MIN()` was never called. — 0b10011, Jun 21 '12 at 19:40
So to clarify what bfrohs issaying, if you run the first query, every row will have inttime=0. Either that, or you need to group by all columns that are being preserved. — David Manheim, Jun 21 '12 at 20:10
@DavidManheim, no, that's not what I'm saying. It pulls the minimum value from the column in question, **grouped by the GROUP BY columns**. So, the `IntTime` returned will be expected, but the other rows will not. — 0b10011, Jun 21 '12 at 20:13
i hate implicit aggregate function of MySQL for non-groupped columns (it behaves the same as `select choose_random(Location), choose_random(Head), Id, min(IntTime) from test group by Id`) — Aprillion, Jun 21 '12 at 20:25
BTW this is a variation on [MySQL **top 1 per group**](http://stackoverflow.com/search?q=MySQL+top+1+per+group) (lots of similar queations already answered) — Aprillion, Jun 21 '12 at 20:30
Does the solution by bfrohs work for you? If not, could you edit your question and try to explain in English what you are trying to accomplish? — Gordon Linoff, Jun 21 '12 at 21:38

0b10011 · Accepted Answer · 2012-06-26T15:37:13.793

This query returns the exact final results you're looking for (example):

SELECT `final`.*
FROM `tableName` AS `final`
JOIN (
    SELECT `thead`.`Id`, `Head`, MIN(`intTime`) AS `min_intTime`
    FROM `tableName` AS `thead`
    JOIN (
        SELECT `Id`, MIN(intTime) as min_intTime
        FROM `tableName` AS `tid`
        GROUP BY `Id`
    ) `tid`
    ON `tid`.`Id` = `thead`.`Id`
    AND `tid`.`min_intTime` = `thead`.`intTime`
    GROUP BY `Head`
) `thead`
ON `thead`.`Head` = `final`.`Head`
AND `thead`.`min_intTime` = `final`.`intTime`
AND `thead`.`Id` = `final`.`Id`

How it works

The innermost query groups by Id, and returns the Id and corresponding MIN(intTime). Then, the middle query groups by Head, and returns the Head and corresponding Id and MIN(intTime). The final query returns all rows, after being narrowed down. You can think of the final (outermost) query as a query on a table with only the rows you want, so you can do additional comparisons (e.g. WHERE final.intTime > 3).

Zane Bien · Answer 2 · 2012-06-21T21:02:26.567

4

SELECT a.*
FROM test a
NATURAL JOIN
(
    SELECT c.Head, c.Id, MIN(c.IntTime) AS IntTime
    FROM test c
    NATURAL JOIN
    (
        SELECT Id, MIN(IntTime) AS IntTime
        FROM test
        GROUP BY Id
    ) d
    GROUP BY c.Head
) b
ORDER BY a.Location

edited Jun 21 '12 at 21:02

answered Jun 21 '12 at 20:32

Zane Bien

22,685
6
45
57

score 1 · Answer 3 · answered Jun 21 '12 at 19:43

1

If I understand correctly, you want the rows that have the smallest int/time values by head and id. I am not seeing what the second "group by head" is getting you.

The following does what I think you want:

select t.*
from t join
     (select head, id, min(intTime) as min_intTime
      from t
      group by head, id
     ) tsum
     on tsum.head = t.head and
        tsum.id = t.id and
        tsum.min_intTime = t.intTime

You might just want the smallest intTime for all "head" values. If so, the query looks like:

select t.*
from t join
     (select head, min(intTime) as min_intTime
      from t
      group by head
     ) tsum
     on tsum.head = t.head and
        tsum.min_intTime = t.intTime

answered Jun 21 '12 at 19:43

Gordon Linoff

1,242,037
58
646
786

I added a clarification. I am trying to GROUP BY over the result of the precedent GROUP BY, by keeping at each time the row with the smallest IntTime – Youcha Jun 21 '12 at 19:54
The row with the smallest intTime after grouping by head, int and then grouping by head is the smallest intTime for head. The answer would be different if you were using sum or average, but the min commutes in this case. – Gordon Linoff Jun 21 '12 at 19:56
I'm not sure I understood everything but grouping by head, id then by id is not equivalent to grouping by head then by id. – Youcha Jun 21 '12 at 20:47
Not in general. But in the specific case of taking the minimum value of intTime, you get the same results. – Gordon Linoff Jun 21 '12 at 20:49
I ran your 2nd code over another example. INTC it that example has the same Id than ARMH and ARMH has a higher TimeInt. The query should not return ARMH. [SQL Fiddle Here](http://sqlfiddle.com/#!2/daa1b/1/0) – Youcha Jun 21 '12 at 21:13
Since ARMH only appears on one row in your example, shouldn't it be included in the output? The one row is the minimum (even after grouping by head/id and then head). Am I misunderstanding what you want? – Gordon Linoff Jun 21 '12 at 21:18
Ok, there was a misunderstanding. I do not want a HEAD or an ID to appear twice. If two HEAD are different but have the same ID, one has to be removed. – Youcha Jun 21 '12 at 21:21

score 0 · Answer 4 · answered Jun 21 '12 at 19:41

0

I am not sure how your sample result arrived at Location = 7 for ARMH, for instance. Try this...

SELECT MIN(Location), Head, Id, MIN(IntTime)
FROM test
GROUP BY Head, Id

answered Jun 21 '12 at 19:41

Web User

7,438
14
64
92

It should be Location = 7 because between 7 and 8, location = 7 has the smallest IntTime – Youcha Jun 21 '12 at 19:43
3

The problem with this appraoch is teh min(location) and min(IntTime) might come from different rows. – Gordon Linoff Jun 21 '12 at 19:43

score 0 · Answer 5 · answered Jun 21 '12 at 20:17

0

Use UNION syntax?

(SELECT Location, Head, Id, MIN(IntTime) FROM test 
GROUP BY Id)
UNION
(SELECT Location, Head, Id, MIN(IntTime) FROM test2 
GROUP BY Head)

answered Jun 21 '12 at 20:17

edwardmp

6,339
5
50
77

You didn't ask the question, how do you know? – edwardmp Jun 21 '12 at 20:23
1

Because the OP states in their question the *exact* results they're expecting? – 0b10011 Jun 21 '12 at 20:24

Use of GROUP BY twice in MySQL

5 Answers5

How it works

Linked