
I am trying to create a Hive table that uses RegexSerDe with the input.regex property to load the file below.

Input File:

$ hdfs dfs -cat /user/t04413b/test.log
{"repoType":3,"repo":"PROD_hive","reqUser":"shdingst","evtTime":"2020-06-09 01:01:23.308"}

Hive create table query:

create external table logs3
(
repo_type  string,
repo string,
requser string,
evttime string
)
row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
with serdeproperties (
"input.regex" = ":(.*),.*:(.*),.*:(.*),.*?:(.*)}.*"
)
stored as textfile;

load data inpath '/user/t04413b/test.log' into table logs3;

select * from logs3;
+------------------+-------------+----------------+----------------+--+
| logs3.repo_type  | logs3.repo  | logs3.requser  | logs3.evttime  |
+------------------+-------------+----------------+----------------+--+
| NULL             | NULL        | NULL           | NULL           |

I tested the regex on Rubular.com and it worked fine, but with RegexSerDe it does not work: every column comes back as NULL. Can someone please help me resolve this? Thanks.

leftjoin
Ragul Cs
    Welcome to Stack Overflow. You can use backticks to format code, like so: ``` code ```. This makes the question more readable. – PiRocks Jun 11 '20 at 19:07

1 Answer

'}' is a special character in regex and should be escaped. The pattern also has to match the entire line, not just part of it, so it needs a leading '^.*':

"input.regex" = "^.*:(.*),.*:(.*),.*:(.*),.*?:(.*)\\}.*"
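
To see why the original pattern passes in an online tester but yields NULL in Hive, it may help to compare "find anywhere" matching with "match the whole line" matching. As far as I know, RegexSerDe requires the regex to match the entire row (Java's Matcher.matches()), while testers such as Rubular search for a match anywhere in the input. The sketch below is only an approximation in Python: re.fullmatch stands in for the whole-line behavior, re.search for the tester behavior, and the helper name whole_line_groups is made up for illustration.

```python
import re

# Sample row from the question.
line = '{"repoType":3,"repo":"PROD_hive","reqUser":"shdingst","evtTime":"2020-06-09 01:01:23.308"}'

# Regex from the question: it starts at ':' and has no leading '.*'.
original = r':(.*),.*:(.*),.*:(.*),.*?:(.*)}.*'

# Regex from the answer, written as the regex engine sees it
# (assumption: Hive turns "\\}" in the DDL string into \} before compiling).
fixed = r'^.*:(.*),.*:(.*),.*:(.*),.*?:(.*)\}.*'


def whole_line_groups(pattern, text):
    """Return the capture groups only if the pattern matches the whole text."""
    m = re.fullmatch(pattern, text)
    return None if m is None else m.groups()


# Search semantics (what an online tester does): a match is found mid-line.
searched = re.search(original, line)
print("search finds a match:", searched is not None)

# Whole-line semantics: the original pattern fails because the row starts with '{'.
print("original, whole line:", whole_line_groups(original, line))

# With the leading '^.*' the whole row matches and four groups are captured.
print("fixed, whole line:   ", whole_line_groups(fixed, line))
```

Note that the captured string values still include their surrounding double quotes (for example "PROD_hive"), because the pattern captures everything between the colon and the comma.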
leftjoin