Regex in HiveQL: "Dangling meta character" error from MapReduce

Question

I have been transforming my data into Hive with Talend.

I run a few regexes on it. Here is a sample of the values one of them runs against:

KRW3TR.899877.GR0054656*DR.798012...2..............GR0054656*EUR*
KRW3TR.899877.GR0054656*DR.798012...2..............GR0054656*EUR*DDT*
KRW3TR.899877.GR0054656*DR.798012...2..............GR0054656*EUR*CCT*

What I am trying to do is extract the last segment: DDT or CCT.

(As the examples show, the last segment is not always present.)

I get this error from MapReduce:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public java.lang.String org.apache.hadoop.hive.ql.udf.UDFRegExpExtract.evaluate(java.lang.String,java.lang.String,java.lang.Integer)  on object org.apache.hadoop.hive.ql.udf.UDFRegExpExtract@a22c4d8 of class org.apache.hadoop.hive.ql.udf.UDFRegExpExtract with arguments

Further down the same stack trace:

Caused by: java.lang.reflect.InvocationTargetException

Caused by: java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 9

This is the regex I use for the extraction:

REGEXP_EXTRACT(columnrr,'^(?:[^*]*\\*){3}([^*]*)',1) as TYPE

My questions: Are these errors related? Do they have anything to do with whether DDT or CCT is present? What should my regex look like?

Thank you.

Ah yes, thank you! So the answer is my question? Should I delete this question? — thecardcaptor, Jan 20 '21 at 10:14

score 0 · Answer 1 · answered Jan 20 '21 at 10:17

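I found it. The asterisk is a reserved character (a metacharacter) in regex, so it has to reach the regex engine escaped as \*. With only two backslashes in the query, the escape was lost before the pattern was compiled: index 9 in the error message is exactly where a bare * would follow [^*]* if the backslash had been stripped. The exceptions are all part of one "Caused by" chain, and they have nothing to do with whether DDT or CCT is present, because the pattern fails to compile regardless of the row's content. Doubling the backslashes fixes it, so the answer is: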

REGEXP_EXTRACT(columnrr,'^(?:[^*]*\\\\*){3}([^*]*)',1) as TYPE
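To make the escaping issue concrete, here is a minimal Java sketch. Hive's REGEXP_EXTRACT is backed by java.util.regex, as the stack trace in the question shows. The class and method names (RegexEscapeDemo, extractType, compileError) are hypothetical and not part of Hive or Talend. The sketch assumes that the failing pattern reached the engine with its backslash already stripped, which is consistent with the reported "near index 9" but is not confirmed by the source. It also shows an equivalent pattern that uses [*] instead of a backslash escape.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; not part of Hive or Talend.
class RegexEscapeDemo {

    // What the regex engine must receive: \* is a literal asterisk.
    // In Java source the backslash is doubled, so "\\*" is the two characters \*.
    static final Pattern ESCAPED = Pattern.compile("^(?:[^*]*\\*){3}([^*]*)");

    // Equivalent pattern without backslashes: inside a character class, * is literal.
    static final Pattern BRACKETED = Pattern.compile("^(?:[^*]*[*]){3}([^*]*)");

    // Returns the text between the third asterisk and the next one (or end of input),
    // or "" if the value has fewer than three asterisks.
    static String extractType(Pattern pattern, String value) {
        Matcher m = pattern.matcher(value);
        return m.find() ? m.group(1) : "";
    }

    // Returns the compile error description for a regex, or null if it compiles.
    static String compileError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        String base = "KRW3TR.899877.GR0054656*DR.798012...2..............GR0054656*EUR*";

        // Assumed form of the pattern after the backslash was lost: two bare asterisks in a row.
        System.out.println(compileError("^(?:[^*]**){3}([^*]*)"));

        // With the escape intact, both variants extract the optional last segment.
        System.out.println(extractType(ESCAPED, base + "DDT*"));   // DDT
        System.out.println(extractType(BRACKETED, base + "CCT*")); // CCT
        System.out.println(extractType(ESCAPED, base).isEmpty());  // true
    }
}
```

The four backslashes in the Hive query are probably needed because the pattern passes through more than one layer of string unescaping (the Hive string literal, and possibly a Java string inside the Talend job) before it reaches the regex engine. The [*] form contains no backslashes, so it may be a more robust way to write the same pattern in REGEXP_EXTRACT, although that has not been verified against this particular Talend and Hive setup.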

Related question: java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 0

thecardcaptor
