1

I am working within Snowflake and some of the IDs in a given table and column are delimited by semicolons. Despite this delimiter the tables should still be joined. Any attempt to join the table is usually met with an error of some sort.

Below I have an example of what I have attempted to do.

       table_A                       table_B
+----------+----------+      +----------+----------+ 
| some_id  |  F_Name  |      | some_id  |  L_Name  |
+----------+----------+      +----------+----------+
|   12345  |   John   |      |12345;4321|    Doe   |
+----------+----------+      +----------+----------+

Attempted SQL Statement

select table_A.some_id, table_A.F_name, table_B.L_Name
from table_A
left join table_B on table_A.some_id like '%'||table_B.some_id||'%'

The source for this particular problem has shown up here but it doesn't seem to work. It may just not be possible to do joins in this particular way. https://community.snowflake.com/s/question/0D50Z00008zPLLx/join-with-partial-string-match

Any help is most appreciated.

tisaconundrum
  • 2,156
  • 2
  • 22
  • 37
  • Ideally you shouldn't be looking to apply functions to join predicates in flight. When doing so Snowflake's cloud services (or even indexes on other platforms) is unable to prune the results to ensure the query performs optimally. A simple substring would do, but ideally the data should have the column already sub-stringed – patrick_at_snowflake Oct 26 '21 at 21:57

2 Answers2

1

Your approach is correct but your use of LIKE clause isn't. You are looking for table_A.some_id as a partial part of table_B.some_id

So your query should be :

WITH table_A AS (
  SELECT '12345' AS some_id, 'John' AS F_NAME
),
table_B AS (
  SELECT '12345;4321' AS some_id, 'Doe' AS L_NAME
)
SELECT 
    table_A.some_id, 
    table_A.F_name, 
    L_Name
FROM table_A
LEFT JOIN table_B 
    ON table_B.some_id LIKE '%'||table_A.some_id||'%'
;

enter image description here

CMe
  • 642
  • 3
  • 9
1

If you use a CTE to flatten the second table and then filter out duplicates

WITH expand AS (
    SELECT 
       f.seq as seq,
       f.index as index
       f.value as join_token
       t.l_name
    FROM table_b t,
      lateral split_to_table(t.some_id, ';') f
)
SELECT 
  a.some_id, 
  a.f_name, 
  f.l_name
FROM table_a AS a
LEFT JOIN expand AS f
    ON a.some_id = f.join_token
QUALIFY row_number() over (partition by f.seq ORDER BY f.index) == 1

Simeon Pilgrim
  • 22,906
  • 3
  • 32
  • 45