I have a data set as following. To keep it simple I am showing only one column i.e qty but there are multiples for example (dates, prices etc). if any of the value within a single row is a mismatch then that row should show up as output along with the values from both sources, for example:
source 1
id qty
1 100
2 null
3 0
4 50
source 2
id_ qty_
1 100
2 80
3 100
expected output:
id qty id_ qty_
2 null 2 80
3 0 3 100
4 50 null null
What i am trying to achieve is to fetch all the rows from source 2 that doesn't match with the source 1. Source 1 is taken as main source/table.
In the end the output should show only row 2,3,4 with both values so that i can build a boolean saying:
if (qty != qty_ , false, true) as is_qty_matching
An example with multiple values:
source 1
id qty price
1 100 10
2 null 0
3 0 0
4 50 0
5 100 30
source 2
id_ qty_ price
1 100 10
2 80 0
3 100 0
5 100 33
expected output:
id qty price id_ qty_ price_
2 null 0 2 80 0
3 0 0 3 100 0
4 50 0 null null 0
5 100 30 5 100 33
In this case now id 5 has price mismatch
so my two new columns (is_price_matching) would show false for id 5 while true for other id', while the column (is_qty_matching) will show true for id 5 and false for others.
Similarly i will have multiple other columns and therefore multiple other boolean fields for each of the mismatch entry for example is_date_matching, is_time_matching etc.
what's important is that:
- The values can be null in either of source for any of the column, this might mess up the boolean maybe
- only the rows that has even one single mismatch value should show up in the final output, if there is no mismatch in any of the column values it should not show up
I have tried the following query:
with main as (
select 1 as id , 100 as qty , 50 as price
union all
select 2 as id , 0 as qty , 100 as price
union all
select 3 as id , 0 as qty ,80 as price
union all
select 4 as id , 50 as qty , 90 as price
union all
select 5 as id , 20 as qty , 100 as price
union all
select 6 as id , 20 as qty , 100 as price
),
main2 as (
select 1 as id_, 100 as qty_ , 50 as price_
union all
select 2 as id_, 80 as qty_ , 100 as price_
union all
select 3 as id_, 100 as qty_ , 80 as price_
union all
select 5 as id_, 20 as qty_ , 100 as price_
union all
select 6 as id_, 20 as qty_ , 40 as price_
)
select
main.*,
main2.*
from main
left join main2
on (main.id = main2.id_)
WHERE coalesce(qty,0) != coalesce(qty_,0)
or coalesce(main.price,0) != coalesce(main2.price_,0)
it seems to work but i was wondering if there is a better solution ? plus if i add the following line of code
if(qty != qty_ , true , false) as is_qty_mismatch
it will return incorrect output where the values are null