1

I've got a Transact SQL problem which I don't understand.

I have 2 tables tblMedewerker2 and tblMedewerker3.

tblMedewerker2 has got the following values for employeenumber :129, 143,144,145,146,147,169.

tblMedewerker3 has got the following values for employeenumber: 129, 143,144,145,146,147, 166,167,168.

They contain 7 respectively 9 rows, so the values are unique.

The following query yields 63 rows :

select  
        a.employeenumber as emp_a
        , b.employeenumber as emp_b 
        , isnull(a.employeenumber, b.employeenumber) as single_employeenumber
from tblMedewerker2 a
full join
tblMedewerker3 b
on exists
(
select a.employeenumber from tblMedewerker2
union
select b.employeenumber from tblmedewerker3
)

whereas this query yields 10 rows:

select  
        a.employeenumber as emp_a
        , b.employeenumber as emp_b 
        , isnull(a.employeenumber, b.employeenumber) as single_employeenumber
from tblMedewerker2 a
full join
tblMedewerker3 b
on exists
(
select a.employeenumber from tblMedewerker2
intersect
select b.employeenumber from tblmedewerker3
)

Why would the first query turn the SQL into some sort of CROSS JOIN ?

I would say the exists just gives back a TRUE or a FALSE. So why the difference in numbers of records in both queries ?

Thanks ! Rgds BB

Progman
  • 16,827
  • 6
  • 33
  • 48
Billybob49
  • 11
  • 1
  • Another solution might be to split the problem into two parts - find the distinct/unique set of employee numbers from the two tables, perhaps as a common table expression and them join this as a left join to both table a and table b; this gets round the problem of dealing with a cartesian/full join – Peter Smith Dec 23 '20 at 21:08
  • It's useful to others if you could indicate if and how you have been helped. – Peter Smith Dec 24 '20 at 07:50

2 Answers2

0

It comes down to an how all JOINs work. Let's think about a simple INNER JOIN

SELECT a.id, b.id
FROM
a
INNER JOIN
b 
ON a.id = b.id

This is saying "Compare every row to every row in the other table. When the ON condition is true, include the rows joined together in the result"

Now consider the following valid query:

SELECT a.id, b.id
FROM
a
INNER JOIN
b 
ON 1==1

Again, the way it works is as described above. "Compare every row to every row in the other table. When the ON condition is true, include the rows joined together in the result". In this case the ON condition is true for all the comparisons.

So if the left table had 7 rows and the right table has 9, you'll get 63 rows. (I put it in a SQL Fiddle for you here to see for yourself: http://sqlfiddle.com/#!18/87097/17)

Your ON EXISTS statement in your first query is always going to be true, since any row in the tables you are joining can be found in the UNION. Its very similar to my 1==1 example above. The fact that it's a FULL JOIN in the first query doesn't matter. If it's a LEFT JOIN or INNER JOIN or FULL JOIN it will return 63 rows.

In your second query the ON condition is only true in a limited set of circumstances: When the row being evaluated happens to be in the intersection of both tables.

As a sidenote. Your second query can be simplified to a usual ON clause comparing the employeeNumbers. This is because the ON clause is really taking the intersection of both tables. You can write your second query as:

select  
        a.employeenumber as emp_a
        , b.employeenumber as emp_b 
        , isnull(a.employeenumber, b.employeenumber) as single_employeenumber
from tblMedewerker2 a
full join
tblMedewerker3 b
on a.employeeNumber = b.employeeNumner
k29
  • 641
  • 6
  • 26
0

As mentioned in my comment above another solution (rather than an answer to cartesian products and set intersection part of your question) might be based on a different approach - this is just sketched out and not tested (it's late here and I'm tired):

Generate a CTE of employee IDs and LEFT JOIN this to each table

WITH EmployeeNumbers AS
SELECT DISTINCT employeenumber
FROM
(SELECT employeenumber FROM tblMedewerker2
UNION ALL
SELECT employeenumber FROM tblMedewerker2
) AS p

SELECT
    t2.employeenumber  AS empA,
    t3.employeenumber  AS empB,
    ISNULL(t2.employeenumber, t3.employeenumber) AS single_employeenumber
    -- an alternative to the above line
    -- EN.employeenumber AS single_employeenumber
FROM
    EmployeeNumbers  AS EN
    LEFT JOIN
    tblMedewerker2 AS T2 ON EN.employeenumber = T2.employeenumber 
    LEFT JOIN
    tblMedewerker3 AS T2 ON EN.employeenumber = T2.employeenumber 

Peter Smith
  • 5,528
  • 8
  • 51
  • 77