Can someone please guide me how should I Load data in hive where I am getting " in some rows and in some rows data is coming without " for the same column value.
Sample Data:
id,name,desc,uqc,roll,age
1,Monali,"abhc,jkjk",,23,23
2,mj,nhiijkla,67,23,60
7,jena,"kdjuu,hsysi,juw",3,34,23
1,Monali,"/"coppers bars","rods and profiles"/",,23,23
2,money,"/"COUPLING","FLANGES & CROSS OVER"/",67,23,60
In above data for id '2' " are not there in desc column value.
My Create Statement:
create external table testing(id int,
name string,
desc string,
uqc double,
roll int,
age int
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ('input.regex'='^(\\d+?),(.*?),"(.*)",([0-9.]*),([0-9]*),([0-9]*).*')
location ....
TBLPROPERTIES("skip.header.line.count"="1")
;
While loading Data I'm not getting any error. But when I doing select * from testing.select statement is not Executing. Above Create and select statement working fine If data is coming with ", but is not working if data is coming with and without ".