I am trying to create a Hive table from a CSV file that is stored in HDFS. The problem is that the CSV contains line breaks inside quoted fields. Example of a record in the CSV:
ID,PR_ID,SUMMARY
2063,1184,"This is problem field because consists line break
This is not new record but it is part of text of third column
"
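To make the expected count concrete, here is a minimal Python sketch. It is only an illustration and not part of the Hive setup; the names `SAMPLE`, `count_physical_lines` and `count_csv_records` are made up for this example. It compares what a quote-aware CSV parser sees in the sample above with what a plain split on newlines sees.

```python
import csv
import io

# The sample from above: a header line plus ONE record whose third
# column contains embedded line breaks inside the quotes.
SAMPLE = (
    'ID,PR_ID,SUMMARY\n'
    '2063,1184,"This is problem field because consists line break\n'
    'This is not new record but it is part of text of third column\n'
    '"\n'
)


def count_physical_lines(text):
    """Count newline-delimited lines, ignoring CSV quoting entirely."""
    return len(text.splitlines())


def count_csv_records(text):
    """Count data records with a quote-aware parser, excluding the header."""
    # newline="" lets the csv module handle embedded line breaks itself.
    reader = csv.reader(io.StringIO(text, newline=""))
    rows = list(reader)
    return len(rows) - 1  # subtract the header row


if __name__ == "__main__":
    print("physical lines:", count_physical_lines(SAMPLE))
    print("csv records:", count_csv_records(SAMPLE))
```

The quote-aware count is the number I expect from the table; the line-based count is what you get when every newline is treated as a record boundary.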
I created the Hive table:
CREATE TEMPORARY EXTERNAL TABLE hive_database.hive_table
(
ID STRING,
PR_ID STRING,
SUMMARY STRING
)
row format serde 'com.bizo.hive.serde.csv.CSVSerde'
with serdeproperties (
"separatorChar" = ",",
"quoteChar" = "\"",
"escapeChar" = "\""
)
stored as textfile
LOCATION '/path/to/hdfs/dir/csv'
tblproperties('skip.header.line.count'='1');
Then I count the rows (the correct result should be 1):
Select count(*) from hive_database.hive_table;
But the result is 4, which is incorrect. Do you have any idea how to solve this? Thanks, all.