1

Suppose there are the following rows

| Id | MachineName | WorkerName | MachineState |
|----------------------------------------------|
| 1  | Alpha       | Young      |  RUNNING     |
| 1  | Beta        |            |  STOPPED     |
| 1  | Gamma       | Foo        |  READY       |
| 1  | Zeta        | Zatta      |              |
| 2  | Guu         | Niim       |  RUNNING     |
| 2  | Yuu         | Jaam       |  STOPPED     |
| 2  | Nuu         |            |  READY       |
| 2  | Faah        | Siim       |              |
| 3  | Iem         |            |  RUNNING     |
| 3  | Nyt         | Fish       |  READY       |
| 3  | Qwe         | Siim       |              |

We want to merge these rows according to following priority :

STOPPED > RUNNING > READY > (null or empty)

If a row has a value for greatest priority, then value from that row should be used (only if it is not null). If it is null, a value from any other row should be used. The rows should be grouped by id

The correct output for the above input is :

| Id | MachineName | WorkerName | MachineState |
|----------------------------------------------|
| 1  | Beta        | Foo        |  STOPPED     |
| 2  | Yuu         | Jaam       |  STOPPED     |
| 3  | Iem         | Fish       |  RUNNING     |

What would be a good sql query to accomplish this? I tried using joins, but it did not work out.

Ace McCloud
  • 701
  • 1
  • 5
  • 16
  • Can you show what was your query? – Jorge Campos Mar 08 '16 at 12:04
  • Which record(s) do you want in the resultset if/when there are multiple records with the maximal `(Id,MachineState)` combination? – eggyal Mar 08 '16 at 12:07
  • @eggyal There will never be such a case. – Ace McCloud Mar 08 '16 at 12:09
  • Your problem indicates poor schema design. If `WorkerName` is not genuinely a property of the row in which it is held (e.g. the value you want for `WorkerName` when retrieving a row may actually be that from some other row) then it doesn't really belong there. You should consider redesigning the schema to better reflect the semantics of your business problem. – eggyal Mar 08 '16 at 12:24
  • Also, I've just noticed that this question is tagged both [tag:mysql] and [tag:psql] which are competing RDBMS products from different vendors. Which are you actually using? – eggyal Mar 09 '16 at 11:19
  • I am using psql. It got tagged mysql accidentally. Removed that tag. – Ace McCloud Mar 10 '16 at 11:08

3 Answers3

1

This is a prioritization query. One method uses variables. Another uses union all . . . this works if the states are not repeated for a given id:

select t.*
from table t
where machinestate = 'STOPPED'
union all
select t.*
from table t
where machinestate = 'RUNNING' and
      not exists (select 1 from table t2 where t2.id = t.id and t2.machinestate in ('STOPPED'))
union all
select t.*
from table t
where machinestate = 'READY' and
      not exists (select 1 from table t2 where t2.id = t.id and t2.machinestate in ('STOPPED', 'RUNNING'));
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

You can view this as a case of the group-wise maximum problem, provided you can obtain a suitable ordering over your MachineState column—e.g. by using a CASE expression:

SELECT   a.Id,
         COALESCE(a.MachineName, t.MachineName) MachineName,
         COALESCE(a.WorkerName , t.WorkerName ) WorkerName,
         a.MachineState
FROM     myTable a JOIN (
           SELECT   Id,
                    MIN(MachineName) AS MachineName,
                    MIN(WorkerName ) AS WorkerName,
                    MAX(CASE MachineState
                      WHEN 'READY'   THEN 1
                      WHEN 'RUNNING' THEN 2
                      WHEN 'STOPPED' THEN 3
                    END) AS MachineState
           FROM     myTable
           GROUP BY Id
         ) t ON t.Id = a.Id AND t.MachineState = CASE a.MachineState
           WHEN 'READY'   THEN 1
           WHEN 'RUNNING' THEN 2
           WHEN 'STOPPED' THEN 3
         END

See it on sqlfiddle:

| id | machinename | workername | machinestate |
|----|-------------|------------|--------------|
|  1 |        Beta |        Foo |      STOPPED |
|  2 |         Yuu |       Jaam |      STOPPED |
|  3 |         Iem |       Fish |      RUNNING |

You could save yourself the pain of using CASE if MachineState was an ENUM type column (defined in the appropriate order). It so happens in this case that a simple lexicographic ordering over the string value will yield the same result, but that's a coincidence on which you really shouldn't rely as it's bound to slip under the radar when someone tries to maintain this code in the future.

eggyal
  • 122,705
  • 18
  • 212
  • 237
0

change MachineState as enum:

`MachineState` enum('READY','RUNNING','STOPPED') DEFAULT NULL

and sql is simple:

select t.id,state.machinename,state.workername,t.mstate from state,(select id,max(MachineState) mstate from state group by Id) t  where t.mstate=state.machinestate and t.id=state.id;
chenyb999
  • 139
  • 4