-1

I have table in Teradata SQL like below:

col1        col2 
Jan Kowal  | Jan Kowal PHU
Bravo JohN | GB John BraVo
Ann Still  | Ronald G

And I need to select only these rows from above table where:

  1. In col2 is value from col1 + something more (nevermind before or after)
  2. Be aware that Jan Kowal and Kowal Jan is name and surname and it could be commutative
  3. Case = False (nevermind whether letter are high or small)

So I need to select only first and second row:

col1          col2
Jan Kowal  | Jan Kowal PHU
Bravo JoHN  | GB John BraVo  
  • Because both rows have value from col1 + something more (before or after) in col2
  • Case of letter = False
  • name and surname and it could be commutative
dingaro
  • 2,156
  • 9
  • 29

1 Answers1

0

You describe the first column as having a "name" and "surname" meaning that it has exactly two components.

For this problem, strtok() can help:

where regexp_instr(col2, strtok(col1, ' ', 1), 'i') and
      regexp_instr(col2, strtok(col1, ' ', 2), 'i')

I've used regular expressions here because it is simpler to modify it you want to consider word boundaries. Your question is not clear on whether that is necessary.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Gordon first column not always have 2 words, could you modify your answer ? – dingaro Jul 10 '21 at 11:43
  • @koler . . . I didn't make up the "2". The question explicitly refers to the two components as "name" and "surname", implying very strongly that there are two. If there is a known maximum, I would just continue using `regexp_similar()` to handle it. However, for a general solution, you should ask another question. – Gordon Linoff Jul 10 '21 at 12:25
  • Note that you probably want `regexp_instr` because `regexp_similar` requires a full-string match. – Fred Jul 10 '21 at 17:14