0

I've searched for this and can't seem to find an answer. I apologize in advance as I'm sure this answer is out there, but I can't seem to find it.

I'm working with a SQL Server 2005 DB, and I'm aware that the query below doesn't represent a normalized DB, since the numPlacements field is in both the detail and rollup table. I didn't create the DB.

The below SQL gives the expected result when the where clause is used. The expected result is all rows where a matching value is missing from either table, or the two values don't match.

However, if I comment the where clause and uncomment the final AND in the ON clause, it returns over 200k rows instead of the expected 120 results.

SELECT CASE WHEN A.ID is NULL THEN B.ID ELSE A.ID END,
       A.numPlacements AS 'AnumPlacements', 
       B.numPlacements  AS 'bnumPlacements',
       B.numPlacements - A.numPlacements as 'Variance'
FROM   (SELECT ID,
               Sum(numPlacements) AS 'numPlacements' 
        FROM   PlacementDetailLevel
        GROUP  BY ID) A 
       FULL OUTER JOIN (SELECT ID,
                               Sum(numPlacements) AS 'numPlacements' 
                        FROM   PlacementRollupLevel
                        GROUP  BY ID) B 
                    ON A.ID = B.ID 
                       --AND B.numPlacements <> A.numPlacements 
 WHERE  A.numPlacements <> B.numPlacements or A.numPlacements is null or B.numPlacements is null

Any ideas as to why?

More detail below based on ypercube's suggestion:

I created TableA and TableB. They look like this:

TableA
ID  numPlacements
1   10
2   20
3   30
4   40

TableB
ID  numPlacements
2   20
3   31
4   40
5   50

Note that the differences are TableA has no #5, TableB has no #1, and #3 has a different numPlacements in both.

SELECT CASE WHEN A.ID is NULL THEN B.ID ELSE A.ID END AS 'ID',
       A.numPlacements AS 'AnumPlacements', 
       B.numPlacements  AS 'BnumPlacements',
       B.numPlacements - A.numPlacements as 'Variance'
FROM   (SELECT ID,
               Sum(numPlacements) AS 'numPlacements' 
        FROM   TableA
        GROUP  BY ID) A 
       FULL OUTER JOIN (SELECT ID,
                               Sum(numPlacements) AS 'numPlacements' 
                        FROM   TableB
                        GROUP  BY ID) B 
                    ON A.ID = B.ID 

The above produces exactly what I'd expect:

ID  AnumPlacements  BnumPlacements  Variance
1   10              NULL            NULL
2   20              20              0
3   30              31              1
4   40              40              0
5   NULL            50              NULL

Let's try adding the WHERE clause.

SELECT CASE WHEN A.ID is NULL THEN B.ID ELSE A.ID END AS 'ID',
       A.numPlacements AS 'AnumPlacements', 
       B.numPlacements  AS 'BnumPlacements',
       B.numPlacements - A.numPlacements as 'Variance'
FROM   (SELECT ID,
               Sum(numPlacements) AS 'numPlacements' 
        FROM   TableA
        GROUP  BY ID) A 
       FULL OUTER JOIN (SELECT ID,
                               Sum(numPlacements) AS 'numPlacements' 
                        FROM   TableB
                        GROUP  BY ID) B 
                    ON A.ID = B.ID 
 WHERE  A.numPlacements <> B.numPlacements or A.numPlacements is null or B.numPlacements is null

With the where, we get the three expected rows:

ID  AnumPlacements  BnumPlacements  Variance
1   10              NULL            NULL
3   30              31              1
5   NULL            50              NULL

Let's try adding the AND.

SELECT CASE WHEN A.ID is NULL THEN B.ID ELSE A.ID END AS 'ID',
       A.numPlacements AS 'AnumPlacements', 
       B.numPlacements  AS 'BnumPlacements',
       B.numPlacements - A.numPlacements as 'Variance'
FROM   (SELECT ID,
               Sum(numPlacements) AS 'numPlacements' 
        FROM   TableA
        GROUP  BY ID) A 
       FULL OUTER JOIN (SELECT ID,
                               Sum(numPlacements) AS 'numPlacements' 
                        FROM   TableB
                        GROUP  BY ID) B 
                    ON A.ID = B.ID 
                       AND B.numPlacements <> A.numPlacements 

Now, if we try it with the above AND in the join, I would expect to get row #3. Instead, I get this:

ID  AnumPlacements  BnumPlacements  Variance
1   10              NULL            NULL
2   NULL            20              NULL
2   20              NULL            NULL
3   30              31              1
4   NULL            40              NULL
4   40              NULL            NULL
5   NULL            50              NULL
John Tarr
  • 717
  • 1
  • 9
  • 21
  • 3
    well, how can you expect the same results if you remove the `where` **and** a join condition? – Lamak May 07 '15 at 15:23
  • You changed several predicates why would you expect the results to be the same? If you removed a column it wouldn't return that column either. – Zane May 07 '15 at 15:31
  • I estimate that your tables have about 400-500 rows each? – ypercubeᵀᴹ May 07 '15 at 17:27
  • @ypercube - tens of thousands each. I can query when I get back to work. – John Tarr May 08 '15 at 16:38
  • @Zane - If I simply put " where A.numPlacements <> B.numPlacements " in the where, then it would essentially negate the FULL OUTER JOIN, correct? Whereas I was under the impression that the correct way to filter and also still get the nulls is to put the condition in the ON clause. Either way, it doesn't explain to me how I get MORE results by placing it in the ON than if I never put it in the ON or WHERE. – John Tarr May 08 '15 at 16:43
  • @Lamak - I didn't remove the where AND a join condition. I said to comment the WHERE and uncomment the final AND in the ON clause. – John Tarr May 08 '15 at 16:44
  • Yeah, I should have said that you have about 400-500 different IDs in the tables. – ypercubeᵀᴹ May 08 '15 at 17:14
  • Also check my answer here: http://stackoverflow.com/questions/6630887/using-is-null-or-is-not-null-on-join-conditions-theory-question/6631535#6631535 Then, I suggest as an exercise, put 3-4 rows in each table (tables similar to yours) and try - on paper first and on the server next - to work what the result would be of the FULL join query. – ypercubeᵀᴹ May 08 '15 at 23:56
  • @ypercube - Thanks for the link to your answer. It was a great explanation, but I still have questions in the back of my mind for my question. I will try to work it out with small tables and refine the question. I n the meantime, how do I mark your answer as the answer? – John Tarr May 11 '15 at 13:54
  • @ypercube - If you check out my updates, I tried what you suggested. Everything makes sense except the last run. If you have any advice, I'd appreciate it. – John Tarr May 11 '15 at 14:39

0 Answers0