I have a similiar question to here: Hive table source delimited by multiple spaces
My data looks like this:
AL, 01, 2016010700, , BEST, 0, 266N, 753W
AL, 01, 2016010706, , BEST, 0, 276N, 747W
AL, 01, 2016010712, , BEST, 0, 287N, 738W
AL, 01, 2016010712, , BEST, 0, 287N, 738W
That means my column delimiter is "a comma plus a variable number of spaces".
I tried to simply modify field.delim
by adding this comma to the regex, but it doesn't work.
The result is, that all data gets put into the first column (basin
) and all other columns are NULL.
CREATE EXTERNAL TABLE IF NOT EXISTS default.myTable1
(
basin string
,cy string
,yyyymmddhh int
,technum_min string
,tech string
,tau string
,lat_n_s string
,lon_e_w string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"=",\\s+")
LOCATION '/data';
I am running HDP 2.5 (Hive 1.2.1).
Thanks for any help and suggestions.