0

I am trying to do a left join between two datasets using a CASE WHEN in databricks statement with the following code:

;WITH CTE1 AS
 (
  SELECT *,ROW_NUMBER()OVER(ORDER BY ts_primarysecondaryfocus)RowNum FROM dataverse.accountv41
 ),CTE2 AS
 (
  SELECT *,ROW_NUMBER()OVER(ORDER BY ts_primarysecondaryfocus)RowNum FROM dataverse.optionsetmetadatav3
 )
 SELECT C1.SinkCreatedOn,C1.SinkModifiedOn,C1.statecode,C1.statuscode
       ,CASE WHEN C1.ts_primarysecondaryfocus <> IFNULL(C2.ts_primarysecondaryfocus,'') THEN C2.ts_primarysecondaryfocus ELSE C1.ts_primarysecondaryfocus END AS ts_primarysecondaryfocus
 FROM CTE1 C1 LEFT JOIN CTE2 C2 ON C1.RowNum = C2.RowNum

The expected output should be: +-----------------+-----------------+-----------+------------+--------------------------+ | SinkCreatedOn | SinkModifiedOn | statecode | statuscode | ts_primarysecondaryfocus | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:25 | 5/19/2022 16:25 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:25 | 5/19/2022 16:25 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | donald | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | TBC | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | Tier 1 | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | Tier 2 | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+

However, I'm getting the following output

+-----------------+-----------------+-----------+------------+--------------------------+ | SinkCreatedOn | SinkModifiedOn | statecode | statuscode | ts_primarysecondaryfocus | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:25 | 5/19/2022 16:25 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:25 | 5/19/2022 16:25 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+

Sample Data accountv41

+-----------------+-----------------+-----------+------------+--------------------------+ | SinkCreatedOn | SinkModifiedOn | statecode | statuscode | ts_primarysecondaryfocus | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:25 | 5/19/2022 16:25 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:15 | 5/19/2022 16:15 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:19 | 5/19/2022 16:19 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+ | 5/19/2022 16:15 | 5/19/2022 16:15 | 0 | 1 | | +-----------------+-----------------+-----------+------------+--------------------------+

Sample Data optionsetmetadatav3

+-----------------------+------------------+-----------------+--------------+--------------------------+ | participationtypemask | instancetypecode | donotpostalmail | donotfax | ts_primarysecondaryfocus | +-----------------------+------------------+-----------------+--------------+--------------------------+ | Sender | Not Recurring | Allow | Allow | Tier 1 | +-----------------------+------------------+-----------------+--------------+--------------------------+ | To Recipient | Not Recurring | Allow | Allow | Tier 2 | +-----------------------+------------------+-----------------+--------------+--------------------------+ | CC Recipient | Not Recurring | Allow | Allow | TBC | +-----------------------+------------------+-----------------+--------------+--------------------------+ | BCC Recipient | Recurring Master | Do Not Allow | Do Not Allow | donald | +-----------------------+------------------+-----------------+--------------+--------------------------+ | Required attendee | Recurring Master | Do Not Allow | Do Not Allow | | +-----------------------+------------------+-----------------+--------------+--------------------------+

Any thoughts on why the Left Join won't work?

Patterson
  • 1,927
  • 1
  • 19
  • 56

1 Answers1

1

The problem is because of the misinterpretation of the table values. When I tried reproducing this query, it gave the correct output when empty values in ts_primarysecondaryfocus column in both tables are empty string ''. It fails when these values are null instead.

When the empty values of ts_primarysecondaryfocus are '':

  • accountv41 table data:

enter image description here

  • optionsetmetadatav3 table data:

enter image description here

  • Query output (the query you used):

enter image description here

When the empty values of ts_primarysecondaryfocus are null:

  • accountv41 table data:

enter image description here

  • optionsetmetadatav3 table data:

enter image description here

  • Query output (Your Query):

enter image description here

  • Modified query output:

enter image description here

I modified the CASE condition so that it would work irrespective of the empty values (empty string or null) present in ts_primarysecondaryfocus. The following query should work without any issue.

WITH CTE1 AS
 (
  SELECT *,ROW_NUMBER()OVER(ORDER BY ts_primarysecondaryfocus)RowNum FROM accountv41
 ),CTE2 AS
 (
  SELECT *,ROW_NUMBER()OVER(ORDER BY ts_primarysecondaryfocus)RowNum FROM optionsetmetadatav3
 )
 SELECT C1.SinkCreatedOn,C1.SinkModifiedOn,C1.statecode,C1.statuscode
       ,CASE WHEN IFNULL(C1.ts_primarysecondaryfocus,'') <> IFNULL(C2.ts_primarysecondaryfocus,'') THEN C2.ts_primarysecondaryfocus ELSE C1.ts_primarysecondaryfocus END AS ts_primarysecondaryfocus
 FROM CTE1 C1 LEFT JOIN CTE2 C2 ON C1.RowNum = C2.RowNum
Saideep Arikontham
  • 5,558
  • 2
  • 3
  • 11
  • OMG! This is truly amazing. I won’t lie I haven’t thoroughly tested this yet, but looking at the screenshots you have provided it clearly looks like you have resolved the issue. It’s very late at the moment so I will test this in the morning In the meantime , I can’t thank you enough. This resolution has literally kept me in a job. Thanks Saideep – Patterson Jul 09 '22 at 23:15