0

Edited

I'm not asking How to write good query but those 3 queries return same results.

Query 1

SELECT v1.id
FROM (
    SELECT DISTINCT t1.id
    FROM t1 LEFT JOIN t2 ON t1.id = t2.id
    WHERE t2.id IS NULL
) v1 INNER JOIN (
    SELECT DISTINCT t3.id
    FROM t3 LEFT JOIN t4 ON t3.id = t4.id
    WHERE t4.id IS NULL
) v2 ON v1.id = v2.id;

Query 2

SELECT DISTINCT t1.id
FROM (t1 LEFT JOIN t2 ON t1.id = t2.id)
    INNER JOIN (t3 LEFT JOIN t4 ON t3.id = t4.id) ON t1.id = t3.id
WHERE t2.id IS NULL AND t4.id IS NULL;

Query 3

SELECT DISTINCT t1.id
FROM t1 LEFT JOIN t2 ON t1.id = t2.id
    INNER JOIN t3 ON t1.id = t3.id LEFT JOIN t4 ON t3.id = t4.id
WHERE t2.id IS NULL AND t4.id IS NULL;

Queries are not hard coded by programmer but generate dynamically by user input.

For example when user inserts find id in t1 (but not in t2) and in t3 (but not in t4), his indention is Query 1. But currently my program generates Query 3 and it looks like OK. I'm wondering this query has a bug in some cases, so that should be changed like Query 2 or 1.

User input (shown above) is just example and converting user input to JOIN statement is difficult at last to me.

Thanks in advanced.

Jason Heo
  • 9,956
  • 2
  • 36
  • 64
  • Regarding revised/clarified question: I believe that *Query 3* will produce different results to *Query 1* if there are multiples of the same id value in the tables. – Turophile Sep 19 '14 at 05:41
  • @Turophile Oops. It was *typo*. fixed. Thanks. – Jason Heo Sep 19 '14 at 06:09
  • what exactly is the question? is this it: **"I'm wondering this query has a bug in some cases"** Do you have any failure test cases? – Paul Maxwell Sep 19 '14 at 10:05
  • @Used_By_Already Thank you for having time. **"what exactly is the question?"** => I want to know 3 queries are same. **"Do you have any failure test cases?"** => Not yet. But I'm not sure TC is succeeded because Test Data Sets are insufficient. – Jason Heo Sep 20 '14 at 01:32
  • Similar question is http://stackoverflow.com/questions/5009573/does-the-order-of-tables-in-a-join-matter-when-left-outer-joins-are-used. His is asking 3 queries are exactly equivalent to other permutations in the FROM subclause. I just want to know Query 1~3 are same. Sorry for poor English. Thanks. – Jason Heo Sep 20 '14 at 01:38

3 Answers3

1

This is the original query:

SELECT v1.id
FROM (SELECT DISTINCT t1.id
      FROM t1 LEFT JOIN t2 ON t1.id = t2.id
      WHERE t2.id IS NULL
     ) v1 INNER JOIN
     (SELECT DISTINCT t3.id
      FROM t3 LEFT JOIN t4 ON t3.id = t4.id
      WHERE t4.id IS NULL
     ) v2
     ON v1.id = v2.id;

If I understand correctly, you want ids that are in t1 and t3, but not in t2 and t4.

I would express the second query as:

SELECT distinct t1.id
FROM t1 INNER JOIN
     t3
     on t1.id = t3.id LEFT JOIN
     t2
     on t1.id = t2.id LEFT JOIN
     t4
     on t1.id = t4.id
WHERE t2.id IS NULL AND t4.id IS NULL;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Sorry for late reply. "you want ids that are in t1 and t3, but not in t2 and t4" => Yes! I've updated my question. – Jason Heo Sep 19 '14 at 02:21
1

I read your original query as asking for "all ids in T1 which are in both T1 and T3 but not in T2 or T4". Is that correct? If so, my query would be:

SELECT DISTINCT t1.id
FROM t1
WHERE EXISTS (SELECT 1 FROM t3 WHERE t1.id = t3.id)
AND  NOT  EXISTS  (SELECT 1 FROM t2 WHERE t1.id = t2.id)
AND  NOT  EXISTS  (SELECT 1 FROM t4 WHERE t1.id = t4.id)
Turophile
  • 3,367
  • 1
  • 13
  • 21
  • Thank you for having time. `SEMI JOIN` is considered for one-to-many relationships in next product. Also I've updated my question. – Jason Heo Sep 19 '14 at 02:23
1

Not sure what the real objective is, nor which dbms is being targeted

 -- oracle
SELECT id FROM ( SELECT id FROM t1 MINUS SELECT id FROM t2 ) a

INTERSECT

SELECT id FROM ( SELECT id FROM t3 MINUS SELECT id FROM t4 ) b
;

 -- sql server
SELECT id FROM ( SELECT id FROM t1 EXCEPT SELECT id FROM t2 ) a

INTERSECT

SELECT id FROM ( SELECT id FROM t3 EXCEPT SELECT id FROM t4 ) b
;

SELECT c.id from (

    SELECT t1.id
    FROM t1 LEFT JOIN t2 ON t1.id = t2.id
    WHERE t2.id IS NULL

    UNION ALL

    SELECT t3.id
    FROM t3 LEFT JOIN t4 ON t3.id = t4.id
    WHERE t4.id IS NULL

) c GROUP BY c.id HAVING count(*) >= 2
;

The next one is potentially quite efficient BUT has a caveat

-- conditions apply, assumes t2 and t4 cannot have ids not in t1 or t3 respectively
SELECT c.id from (

    SELECT a.id from (
      (SELECT ID FROM T1)
      UNION ALL 
      (SELECT ID FROM T2)
    ) a GROUP BY a.id HAVING count(*) = 1

    UNION ALL

    SELECT b.id from (
      (SELECT ID FROM T3)
      UNION ALL 
      (SELECT ID FROM T4)
    ) b GROUP BY b.id HAVING count(*) = 1

) c GROUP BY c.id HAVING count(*) >= 2
;

Fiddles: | Oracle | SQL Server | MySQL |

Paul Maxwell
  • 33,002
  • 3
  • 32
  • 51