1

I have to compare two tables with identical structure (int not null,int not null, varchar2). In both tables field3 is a nullable.

I have the next SQL:

Select 
   t1.field1, t1.field2, t1.field3)  
From
   table1 t1
Where  (field1,field2,field3)
   not in 
   (select field1,
    field2,field3
    from table2 t2)

When the field3 is NULL in any of them (t1 or t2) the query does not return any row. For instance I want to return a row from this data but it returns nothing at all.

Table 1

field1    field2    field3
1         2         <NULL>

Table 2

field1    field2    field3
1         2         'some text data' 

There is workaround to fix such thing by using the NVL function: NVL(field3, 'dummytextorwhatever') but I dont want to include such horrible thing into my code. Any idea to solve this problem with nullable fields?

Thank you!

APC
  • 144,005
  • 19
  • 170
  • 281
Carlos
  • 456
  • 2
  • 7
  • 21
  • Can you please post, what you are expecting in the output ? – Shushil Bohara May 02 '17 at 09:43
  • 2
    It is not an (only) Oracle thing, when writing `WHERE value NOT IN (x, y, z)` this will be internally interpreted as `WHERE value != x AND value != y AND value != z`, and comparing against `NULL` (either for equality or unequality) always yields `FALSE` – DrCopyPaste May 02 '17 at 09:49
  • 1
    @DrCopyPaste - given your name I hope you won't mind if I copy your comment and paste it in my answer? :-D – APC May 02 '17 at 09:59
  • 1
    @APC yes, of course :) – DrCopyPaste May 02 '17 at 11:32

4 Answers4

4

This is known behavior with NOT IN when there are nulls in either the main table or the sub-query's result sets. As @DrCopyPaste puts it so well

"when writing WHERE value NOT IN (x, y, z) this will be internally interpreted as WHERE value != x AND value != y AND value != z, and comparing against NULL (either for equality or unequality) always yields FALSE"

The simple answer is to use NOT EXISTS:

Select 
   t1.field1, t1.field2, t1.field3)  
From
   table1 t1
Where  not exists   
   (select  null   from table2 t2
    where t2.field1 = t1.field1
    and t2.field2 = t1.field2
    and t2.field3 = t1.field3 )

An anti-join will produce the same result

Select 
   t1.field1, t1.field2, t1.field3)  
From
   table1 t1
     left join table2 t2
       on t2.field1 = t1.field1
       and t2.field2 = t1.field2
       and t2.field3 = t1.field3 
where t2.field1 is null

"why do you select a null at the beginning?"

Because with NOT EXISTS it doesn't matter what the sub-query returns. All that matters is that it returns a non-empty result set. It could have been 1 or field1 but it really doesn't matter, so why not null?

APC
  • 144,005
  • 19
  • 170
  • 281
  • why do you select a null at the beginning9 --> "select null from table2 t2" – Carlos May 02 '17 at 10:03
  • I tried your approach and returns all the rows. It must return just the rows which are different on any field. Right? – Carlos May 02 '17 at 10:08
  • Which approach? Both should return the subset of `table1` which don't have matches in `table2`. If that's all the rows in `table1` then that's what they return. – APC May 02 '17 at 10:17
  • The "NOT EXISTS" approach. It retrieves ALL the rows in table1 and not only" the subset of t1 which does not have matches in t2" – Carlos May 02 '17 at 10:26
  • Please **edit your question** to post some test data which reproduces this behavior. Also the code which you're running. presumably your actual SQL is different from the code you have posted here, so please make sure you're correctly translating my code example to run on your real tables. – APC May 02 '17 at 10:30
  • 3
    @Carlos - APC didn't just invent this technique right now, to solve your problem; what he offered is absolutely standard SQL (not even specific to Oracle). If you think it is not doing the right thing on your data, it is 100% certain that you are doing something wrong. Try to figure out what that is; the solution is correct. –  May 02 '17 at 11:24
  • I believe I did not explain properly myself: The idea is to retrieve rows from t1 which are not present in t2 (the comparison is applied to ALL the fields, which one of them is nullable). If I apply the NOT EXISTS pattern, the query retrieves rows that are identical between tables. I posted the only row which is different between tables, the other rows are exactly the same content – Carlos May 02 '17 at 12:09
  • 1
    Thank you everybody for your useful comments and suggestions, really appreciate it – Carlos May 03 '17 at 10:47
1

Try not exists

Select 
  t1.field1, 
  t1.field2, 
  t1.field3
From
  table1 t1
where not exists 
  (select 1
    from table2 t2
  where 
  t1.field1=t2.field1 
  and t1.field2=t2.field2 
  and t1.field3=t2.field3
  )

Sample test

with table1(field1,field2,field3) as
(select 1,2,null from dual),
        table2(field1,field2,field3) as
(select 1,2,'something' from dual)

Select 
  t1.field1, 
  t1.field2, 
  t1.field3
From
  table1 t1
where not exists 
  (select 1
    from table2 t2
  where 
  t1.field1=t2.field1 
  and t1.field2=t2.field2 
  and t1.field3=t2.field3
  )

Output

FIELD1 FIELD2 FIELD3
1      2      
Utsav
  • 7,914
  • 2
  • 17
  • 38
1

Try to use NVL or Coalesce operators, like this

Select 
   t1.field1, t1.field2, t1.field3 
From
   table1 t1
Where  (nvl(field1,0),nvl(field2,0),nvl(field3,0))
   not in 
   (select nvl(field1,0),nvl(field2,0),nvl(field3,0)
    from table2 t2)

but if in tables data there is some data equals 0 select will be return that row, because nvl(field1,0)=nvl(field2,0) when field1=0 and field2=null, so you can use any value(you should be confident ) wich no exists in your tables data for example -99(nvl(field,-99))

or you can use exists/not exists

Vecchiasignora
  • 1,275
  • 7
  • 6
  • That is the point OP is trying to make. You cannot 100% guarantee that some value doesn't exists in your data. We used to use `-99999` but still you `cannot` say that there is no remote chance that this value is not used in actual data. – Utsav May 02 '17 at 09:50
1

Based on your query, you are trying to find all the times in table1 that do not exist in table2. Instead of NOT IN, consider using a MINUS...

Select t1.field1, t1.field2, t1.field3
From   table1 t1
Minus
select t2.field1, t2.field2, t2.field3
from   table2 t2; 
unleashed
  • 771
  • 3
  • 9
  • The minus itself works like a charm: The problem comes when you want to delete such delta (t1 minus t2) from t1: The "NOT IN" does not behave properly in a delete where when nulls are found-->unknown for oracle. Thank you for the proposal :) – Carlos May 02 '17 at 14:52
  • So, you are trying to delete table1 with the result from this minus? Then just delete based on the results of the minus. This is cleaner than the NOT IN approach. Perhaps you should focus your question on what you are actually trying to accomplish instead of focusing in on the behavior of NOT IN with nulls. – unleashed May 02 '17 at 15:05
  • That's exactly the problem: When doing a minus to delete, the minus does not recognize the NULL values (result is UNKNOWN) then the minus is not a valid approach for me either. The minus itself retrieves properly the delta, but when you incorporate the minus to a delete, the delete does nothing for the affected rows in the minus, because of the NULL fields. In order to have it working, NVL is required, which I'm trying to avoid and the reason for I'm asking an alternative – Carlos May 02 '17 at 17:09
  • I've done this often, but one of the fields in the minus is usually the PK for the table. So, you could then wrap the minus with a select for the PK value, then use that as the source of the delete. Other than that, I don't know of a solution that would not involve NVL's. I'm not sure why it is such a horrible thing. – unleashed May 02 '17 at 20:16
  • It's because on the one hand is not accurate and on the other hand the code for a table with many fields like 20 (already ugly too) when comparing all these fields is not even readable without a proper headache beforehand. You don't know what is coming on such nullable fields, and the pattern that you use in the NVL for replacement might be a "real" value used indeed somewhere else from the data to compare to, with the consequent mistake and misleading data inserted/updated. – Carlos May 03 '17 at 10:46
  • Just because it doesn't work how you want it to work does not mean it is wrong. IN and NOT IN lists work on equality, and nothing is equal to null, not even null, because null is undefined. Thus, your results are entirely accurate. Also, for what you are doing, NOT IN, is really just not the right solution. Besides my MINUS, you have several other solutions which would be more accurate including NOT EXISTS and anti-joins. – unleashed May 03 '17 at 12:42