33

I have two selects that I'm currently running as a UNION successfully.

(SELECT a.user_id,
          a.updatecontents AS city,
          b.country
   FROM userprofiletemp AS a
   LEFT JOIN userattributes AS b ON a.user_id=b.user_id
   WHERE typeofupdate='city')
UNION DISTINCT
  (SELECT a.user_id,
          c.city,
          c.country
   FROM userverify AS a
   LEFT JOIN userlogin AS b ON a.user_id=b.user_id
   LEFT JOIN userattributes AS c ON a.user_id=c.user_id
   WHERE b.active=1
     AND a.verifycity=0);

The results come back like this:

100 Melbourne Australia
200 NewYork America
300 Tokyo Japan
100 Sydney Australia

The catch is the query will bring duplicate user_id (in this case the 100). The details in the first query take precedent for me and if the user_id is repeated in the second query I don't need it.

Is there a way to get a UNION to be DISTINCT on a column? in this case the user_id? Is there a way to do the above call and not get duplicate user_id's - drop the second. Should I re-write the query differently and not use a UNION. Really want it as one query - I can use to SELECT's and PHP to weed out duplicate if necessary.

thx Adam

Ankit Jindal
  • 3,672
  • 3
  • 25
  • 37
Adam
  • 19,932
  • 36
  • 124
  • 207

3 Answers3

38

No. You cannot specify which exact field you need to distinct with. It only works with the whole row.

As of your problem - just make your query a subquery and in outer one GROUP BY user_id

SELECT * FROM 
(SELECT a.user_id,a.updatecontents as city,b.country
FROM userprofiletemp AS a
LEFT JOIN userattributes AS b ON a.user_id=b.user_id
WHERE typeofupdate='city')

UNION DISTINCT

(SELECT a.user_id,c.city,c.country
FROM userverify AS a
LEFT JOIN userlogin AS b ON a.user_id=b.user_id
LEFT JOIN userattributes AS c ON a.user_id=c.user_id
WHERE b.active=1 AND a.verifycity=0) x
GROUP BY user_id
zerkms
  • 249,484
  • 69
  • 436
  • 539
  • any chance you can give me a push in the right direction with the subquery and outer one group by user_id? haven't done that before, testing but not getting there quickly... – Adam Aug 06 '11 at 07:16
  • @Adam: updated an answer. That is how the query could look like, but have no chance to check if it is correct – zerkms Aug 06 '11 at 07:20
  • If I'd like to find out what the 'x' in 'x GROUP BY user_id' means (also I've read 'a order by ...' somewhere), which docs/tutorial/link would you recommend please? – Jakub Turcovsky Sep 11 '14 at 15:33
  • 1
    @2rec: `x` is an alias for the nested query that precedes it, since every one must have an alias. – zerkms Sep 11 '14 at 20:04
  • @zerkms I see, thank you. So this could be any string I like, it doesn't matter what I put there, right? – Jakub Turcovsky Sep 12 '14 at 07:15
4

MySQL UNION produces distinct rows—however, all column values in the row need to be distinct. If you wish to limit the distinction to a single or a few columns, when other columns are not distinct, you can wrap the UNION in a sub-query and GROUP BY the sub-query by the columns you wish to be unique.

Here I wrap the entire UNION in a sub-query, give it an alias, then GROUP BY the desired unique column:

SELECT * FROM (

SELECT a.user_id,a.updatecontents as city,b.country
FROM userprofiletemp AS a
LEFT JOIN userattributes AS b ON a.user_id=b.user_id
WHERE typeofupdate='city'

UNION

SELECT a.user_id,c.city,c.country
FROM userverify AS a
LEFT JOIN userlogin AS b ON a.user_id=b.user_id
LEFT JOIN userattributes AS c ON a.user_id=c.user_id
WHERE b.active=1 AND a.verifycity=0

) aa GROUP BY user_id;

If you have more than one column you'd like to include in the distinction, list them after the GROUP BY: such as GROUP BY user_id, city.


SIDE NOTE: since, in this case, UNION does not provide the desired distinction, there is no benefit to simply using UNION, and apparently "UNION ALL is much faster than UNION", therefore you can use UNION ALL to speed up this query.

bloodyKnuckles
  • 11,551
  • 3
  • 29
  • 37
2
(SELECT a.user_id,a.updatecontents as city,b.country
FROM userprofiletemp AS a
LEFT JOIN userattributes AS b ON a.user_id=b.user_id
WHERE typeofupdate='city')

UNION ALL

(SELECT a.user_id,c.city,c.country
FROM userverify AS a
LEFT JOIN userlogin AS b ON a.user_id=b.user_id
LEFT JOIN userattributes AS c ON a.user_id=c.user_id
WHERE b.active=1 AND a.verifycity=0
  AND a.user_id NOT IN
      ( SELECT user_id
        FROM userprofiletemp 
        WHERE typeofupdate='city'
      ) 
);
ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235