For example:

df.select('category').show()

+---------------------------+
|                   category|
+---------------------------+
|            money,insurance|
|            life, housework|
|           game,FPS,network|
|            game,fight,jump|
|                      hotel|
|                 trip,hotel|
|                       null|
+---------------------------+

I want to use RLIKE with a regular expression to fuzzy match any of the substrings in the list ['money', 'life'].

-- This is an exact match
SELECT * 
FROM tb_name
WHERE col_name RLIKE '(money|life)'

-- This is a fuzzy match
SELECT * 
FROM tb_name
WHERE col_name RLIKE '*.(money|life)'

But the fuzzy match snippet fails with an error in the AST tree:

06-11 16:59:17-fatal filter ast tree

(TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TAB tb_name))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR "hdfs://XXXX/XX")) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (RLIKE (TOK_TABLE_OR_COL col_name ) '*.(money|life)')) (TOK_LIMIT 2000)))

06-11 16:59:17-fatal Filter feature: .TOK_TAB \S tdw_inter_db.*|.TOK_(CUBE|ROLLUP) .

I can't see anything wrong with the fuzzy match snippet.
Could anyone help me?
Thanks in advance.

Bowen Peng
  • What should the fuzzy pattern do? `RLIKE 'money|life'` already matches strings containing either `money` or `life`. – leftjoin Jun 11 '20 at 09:48
  • @leftjoin I don't know whether this is an implementation problem on the intranet platform or something else. If I try `RLIKE 'money|life'` or `RLIKE '(money|life)'`, it only returns values that are exactly equal to `money` or `life`. – Bowen Peng Jun 11 '20 at 12:06
  • Look at this: https://demo.gethue.com/hue/editor?editor=123435 `select 'life, housework' rlike ('money|life')` returns TRUE – leftjoin Jun 11 '20 at 12:16
  • @leftjoin Thank you very much. I am now sure the intranet platform's implementation of `RLIKE` has a problem. – Bowen Peng Jun 11 '20 at 12:36

1 Answer

The regexp `'(?i)money|life'` matches strings containing either `money` or `life`. The `(?i)` prefix makes the match case-insensitive.
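
The same behaviour can be checked outside the platform with a small Python sketch. It rests on an assumption: `re.search` stands in for `RLIKE`, since Hive and Spark evaluate `RLIKE` with Java regular expressions and return true when any substring matches, which should be comparable for a pattern this simple. The helper `rlike` and the sample list are hypothetical, with the values copied from the question's `category` column. The sketch also shows that the question's `'*.(money|life)'` is rejected as a pattern, because it starts with a quantifier that has nothing to repeat. Java's regex engine rejects a leading `*` as well. `'.*(money|life)'` was probably intended, although the leading `.*` is not needed for a substring match. Whether this is what triggers the platform's error is not clear from the log.

```python
import re

# Sample values from the question's `category` column (None stands in for null).
categories = [
    "money,insurance",
    "life, housework",
    "game,FPS,network",
    "game,fight,jump",
    "hotel",
    "trip,hotel",
    None,
]

# Case-insensitive alternation, as suggested in the answer.
pattern = re.compile(r"(?i)money|life")


def rlike(value, compiled):
    """Approximate RLIKE: true if the pattern is found anywhere in the value."""
    if value is None:
        return None  # RLIKE on a NULL input yields NULL, not false
    return compiled.search(value) is not None


# Keep only the rows where the approximate RLIKE is true.
matches = [c for c in categories if rlike(c, pattern)]
print(matches)

# The question's second pattern begins with `*`, which has nothing to repeat.
try:
    re.compile(r"*.(money|life)")
    leading_star_ok = True
except re.error:
    leading_star_ok = False
print(leading_star_ok)
```

If the platform returns only rows that are exactly `money` or `life`, as described in the comments, it is presumably anchoring the pattern to the whole value, which differs from the substring semantics sketched here.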

leftjoin