1

I have a table for which I need to know what data has updated from previous day to today. Assuming previous days data is in t1 and current day data in t2, I tried to do "t1 FULL OUTER JOIN t2" on all columns (because I need to get updates on whole table irrespective of any column) where each column of t2 is NULL.

The problem is if a row in t1 has null value in any of its columns and even though the rows remains same in t2, it is resulting in output, which I don't want. Here is an example of my situation.

create table t1(Host varchar(20), location varchar(20), OS varchar(20))
create table t2(Host varchar(20), location varchar(20), OS varchar(20))

  insert into t1 (Host,location,OS)
    values ('Host1','Location1','Linux'),
            ('Host2','Location2','Unix'),
            ('Host3','Location3','Solaris'),
            ('Host4','Location4','Windows'),
            ('Host5',null,'linux') ---> Host5 remains same in both tables

  insert into t2 (Host,location,OS)
    values ('Host1','Location1','Linux'),
            ('Host2','Location2','Unix'),
            ('Host3','Location3','Windows'),
            ('Host4','Location7','Windows'),
            ('Host5',null,'linux') ---> Host5 remains same in both tables

Query:

  SELECT distinct t1.Host, t1.location, t1.OS
  FROM t1 FULL OUTER JOIN
  t2 ON t1.Host = t2.Host 
         AND t1.location = t2.location 
         AND t1.OS = t2.OS
  WHERE (t2.Host IS NULL) OR
        (t2.location IS NULL) OR
        (t2.OS IS NULL)

The output was :

Host    location    OS
---------------------------
NULL    NULL        NULL
Host3   Location3   Solaris
Host4   Location4   Windows
Host5   NULL        linux 

In the expected result Host5 need to be removed as the same row exists in both the table.

I understand this is due to the nature of FULL OUTER JOIN with the WHERE condition and NULL value in data. Just want to know if there is any alternative to get only the updated records.

James Z
  • 12,209
  • 10
  • 24
  • 44
Ramu
  • 343
  • 1
  • 6
  • 21

2 Answers2

1

Have a look at EXCEPT and see if it fits your needs

SELECT *
FROM t1
EXCEPT 
SELECT *
FROM t2  

Output:

Host    location    OS 
--------------------------------
Host3   Location3   Solaris 
Host4   Location4   Windows
CaseyR
  • 437
  • 5
  • 13
  • this works perfectly, but just want to test for other use cases. Thanks for your time – Ramu Dec 29 '16 at 20:20
0
SELECT distinct t1.Host, t1.location, t1.OS
FROM t1
LEFT JOIN t2 ON 
   t1.Host = 2.Host AND
   COALESCE(t1.location,'<<null>>') = COALESCE(t2.location,'<<null>>') AND 
   t1.OS = t2.OS
WHERE COALESCE(t2.Host,t2.location,t2.OS) is null

Will give you t1 not in t2

SELECT distinct t2.Host, t2.location, t2.OS
FROM t2
LEFT JOIN t1 ON t1.Host = t2.Host AND t1.location = t2.location AND t1.OS = t2.OS
WHERE COALESCE(t1.Host,t1.location,t1.OS) is null

Will give you t2 not in t1

Hogan
  • 69,564
  • 10
  • 76
  • 117
  • Thanks for your time. When i executed the first query i got the error - An expression of non-boolean type specified in a context where a condition is expected, near ','. Not sure if there is anything wrong at my end. – Ramu Dec 29 '16 at 20:01
  • @Hogan it does not. – Tab Alleman Dec 29 '16 at 20:06
  • @TabAlleman -- they should fix that -- so much nicer... spoiled by DB2 – Hogan Dec 29 '16 at 20:07
  • @Hogan I still get Host5, which is exists exactly the same in both the tables :( – Ramu Dec 29 '16 at 20:16
  • @Hogan -- The entire row is same, which should be eliminated in result. – Ramu Dec 29 '16 at 20:26
  • @Hogan -- it's working now. I just included COALESCE(Column,'<>') for all the column in JOIN. Would run the query for my test cases and update you. – Ramu Dec 29 '16 at 20:32
  • 1
    don't use that for a non string column :D – Hogan Dec 29 '16 at 20:33
  • The NULL elimination is a great trick but it doesn't work when you have a numeric column and any possible value can be represented by a business case. E.g. if you had smallint and the column couldn't be negative then you could use - 1 as the null elimination value but if negative values are allowed as well what would you use as the null elimination value? – Matt Dec 29 '16 at 21:00
  • @Matt -- what would I use? I guess I would have to use `OR` and check for both values equal to null. Be a lot slower. – Hogan Dec 30 '16 at 03:01
  • Hogan that's my question too :) you would probably have to test for null and compare with or etc. I was just pointing out the one major limitation to the null elimination method. I think it's a great tip and fantastic method that I use a lot as well just got the one flaw when there is no value you can put in that cannot be represented in business cases. As far as numeric you could try using a bigint value if column is int, smallint etc but that leads to other issues too. The except works nicely in this particular case but again the null elimination is a great tip! – Matt Dec 30 '16 at 05:32