The following regexp_extract call returns what I expect in Impala, but returns something different when I run it in Hive:
select regexp_extract("efwe FR wefwef", '.*?([[:upper:]]+).*?', 1)
The result in Impala is FR
(as I would expect, i.e. the upper case characters from the first group)
The result in Hive is e
(not what I would expect)
Can anyone explain why this is?
From researching this issue I have read that converting the regular expression to a Java-style regex may help (http://www.regexplanet.com/advanced/java/index.html). But as far as I know, what I have is already a valid Java-style regex.
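
To check the Java side on its own, the pattern can be run directly against java.util.regex. This is only a minimal sketch: it assumes Hive's regexp_extract behaves like Matcher.find() followed by group(n), and the class and method names (RegexDialectDemo, extract) are made up for illustration. In Java, [[:upper:]] is not read as a POSIX class; it is parsed as a character class holding the literal characters ':', 'u', 'p', 'e' and 'r', which would explain why the first 'e' is captured. The Java spellings of "upper-case letter" are \p{Upper} or a plain [A-Z] range.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Hive or Impala.
class RegexDialectDemo {

    // Returns the requested group of the first match, or "" if nothing matches.
    // Assumption: this mirrors how Hive evaluates regexp_extract.
    static String extract(String subject, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(subject);
        return m.find() ? m.group(group) : "";
    }

    public static void main(String[] args) {
        String subject = "efwe FR wefwef";

        // POSIX bracket syntax: Java treats it as the literal set {:, u, p, e, r}.
        System.out.println(extract(subject, ".*?([[:upper:]]+).*?", 1));  // prints "e"

        // Java's own POSIX-style class for upper-case ASCII letters.
        System.out.println(extract(subject, ".*?(\\p{Upper}+).*?", 1));   // prints "FR"

        // Explicit range, which is spelled the same way in both regex dialects.
        System.out.println(extract(subject, ".*?([A-Z]+).*?", 1));        // prints "FR"
    }
}
```

If that assumption about Hive holds, using ([A-Z]+) in the query should give FR in both engines and avoids having to escape the backslash of \p{Upper} inside a Hive string literal.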