2

So I did all the research and couldn't see the same issue anywhere in HIVE.

Followed the link below and I have no issues with data in quotes..

https://github.com/ogrodnek/csv-serde

My external table creation has the below serde properties,but for some reason,the default escapeChar('\') is being replaced by quoteChar which is doublequotes(") for my data.

CREATE EXTERNAL TABLE IF NOT EXISTS people_full(
`Unique ID` STRING,
.
.
.
.
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'  
WITH SERDEPROPERTIES (
   "escapeChar"    = "\\"
)  
STORED AS TEXTFILE

DATA ISSUE :

Sample HDFS Source data : "\"Robs business Target HIVE Output : """Robs business

So the three double quotes as seen in """Robs business after the replacement is causing the data unwanted data delimitation (column is a very long string) may be as HIVE cannot handle three double quotes inside data(quote(") is also my default quote character)?

Why is this happening and is there a solution ? Please help.Many thanks.

Best, Asha

E_net4
  • 27,810
  • 13
  • 101
  • 139

2 Answers2

2

To import your csv file to hdfs with double qoutes in between data and create hive table for that file, follow the query in hive to create external table which works fine and displays each record as of in the file.

create external table tablename (datatype colname,datatype2 colname2) row format
SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES 
("separatorChar" = ",","quoteChar" = "\"") stored as textfile location '/dir_name/';

Here, the tablename represents the name of table, datatype is like string, int or maybe other and colname represents the name of the column you are going to give and finally dir_name is the location of csv or text file in hdfs location.

Arunachalam
  • 51
  • 1
  • 11
  • I am having double quotes in my Text - Column sample - "5f367c99e4b0799af79a2b89","The Washingtons: George and Martha, "Join'd by Friendship, Crown'd by Love"","" In here second column has couple of double quotes inside . Anything can be done to fix on that ? – Sam Berchmans Aug 21 '20 at 04:15
0

Try with the Escaped by it will work. Please find the below screenshot example. enter image description here

  • Hi @kundam,Sorry I edited my question to show right SQL.I was using "escapeChar" = "\\" then I have the mentioned data issue(in my post),so I changed it to "escapeChar" = "\'" to avoid the replacement and it works,but I don't want to change the default escape character which is backslash. –  Oct 06 '17 at 16:40
  • So my actual issue is ('\') being replace with default quote character,is this known issue any HIVE version ? Thanks for the help. –  Oct 06 '17 at 16:48
  • Please **[edit]** your post and show the actual code as text instead of screenshots. Others can't copy and paste from your images. [See here](https://meta.stackoverflow.com/a/285557/1402846) for details. Thank you. – Pang Oct 07 '17 at 03:36
  • @kundam : I am having double quotes in my Text - Column sample - "5f367c99e4b0799af79a2b89","The Washingtons: George and Martha, "Join'd by Friendship, Crown'd by Love"","" In here second column has couple of double quotes inside. Can the above fix also used for this , because my escape quotes is " Please do let me know if you have any fix for the same. – Sam Berchmans Aug 25 '20 at 22:24