I am trying to create a hive table with partition by a single field. The data that i wanted to process is log data. Format of log is:
DATE TIME IPAddress HTTP_METHOD MESSAGE
Create table hive query:
CREATE EXTERNAL TABLE test_Part(
logdate string,
logtime string,
ip string,
message string)
PARTITIONED BY(method string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "(\\d{4}-\\d{2}-\\d{2})\\s(\\d{2}:\\d{2}:\\d{2})\\s(\\d+\\.\\d+\\.\\d+\\.\\d+)\\s(\\S+)\\s(.*$)",
"output.format.string" = "%1$s %2$s %3$s %5$s %4$s"
)
STORED AS TEXTFILE;
And load script:
LOAD DATA LOCAL INPATH '/home/user/tools/apache-hive-1.2.2-bin/scripts/sample1.log' OVERWRITE INTO TABLE test_Part PARTITION(method='GET');
When i run a select query on the above table, it gives me error message as
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: Number of matching groups doesn't match the number of columns
What am i missing ?