I am trying to create a table with CTAS, and I want it to use a multi-character field delimiter, for example "#$" or "^|^".
Query:
CREATE TABLE IF NOT EXISTS <table_name>
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="@#")
STORED AS TEXTFILE
AS
SELECT <columns> FROM <table>;
Running it throws a NullPointerException:
ERROR : Job failed with java.lang.NullPointerException
java.util.concurrent.ExecutionException: Exception thrown by job
at org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:337)
at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:342)
at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:362)
at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:323)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 4, agent3659-phx3.prod.uber.internal, executor 1): java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Error while processing row (tag=0) {"key":{},"value":{"_col0":"<column_value>"}}
I also tried RegexSerDe and got the same error:
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES ("input.regex" = "^(\\d+)~\\*(.*)$")
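To make clear what that input.regex is meant to match, here is a minimal Python sketch. It assumes the delimiter is the literal two characters ~* and that Hive unescapes the doubled backslashes in the string literal, so the regex engine sees ^(\d+)~\*(.*)$. The function name split_row and the sample rows are made up for illustration and are not from my data.

```python
import re

# The pattern as the regex engine should see it once the doubled
# backslashes in the Hive string literal are unescaped (assumption).
PATTERN = re.compile(r"^(\d+)~\*(.*)$")


def split_row(line):
    """Return (numeric_id, rest) for a row delimited by the literal "~*",
    or None when the row does not match the pattern."""
    match = PATTERN.match(line)
    return match.groups() if match else None


# Hypothetical sample rows, for illustration only.
print(split_row("123~*some text"))   # numeric first field: matches
print(split_row("abc~*some text"))   # non-numeric first field: no match
```

So the regex only accepts rows whose first field is numeric, and everything after the first ~* lands in the second group.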
It would be great to get your input on this.