0

I need to create external table for a hdfs location. The data is having null instead of empty space for few fields. If the field length is less than 4 for such fields, it is throwing error when selecting data. Is there a way to define replacement of all such nulls with empty space while creating table it self.?

I am trying it in greenplum, just tagged hive to see what can be done for such cases in hive.

Srini
  • 3,334
  • 6
  • 29
  • 64
  • Can you print the error traces? Sorry, I cannot understand this statement: "if the field length is less than 4 for such fields, it is throwing error...". Can you try to elaborate a bit more, please? – frb Apr 23 '15 at 06:24
  • I have data in file which is created by java and each column data is seperated by pipe symbol. For the columns with no data, i have placed null in the file. Now, using this file, i have created external tale. If the field length is less than 4 characters, as the file has null instead of empty string, it is thrown error as ERROR: value too long for type character(1) – Srini Apr 23 '15 at 07:15

2 Answers2

0

You could use the serialization property for mapping NULL string to empty string.

  CREATE TABLE IF NOT EXISTS abc ( ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE TBLPROPERTIES ("serialization.null.format"="")

In this case when you query it from hive you would get empty value for that field and hdfs would have "\N".

Or

If you want to represented empty string instead of '\N', you can using COALESCE function:

 INSERT OVERWRITE tabname SELECT NULL, COALESCE(NULL,"") FROM data_table;
K S Nidhin
  • 2,622
  • 2
  • 22
  • 44
0

the answer to the problem is using NULL as 'null' statement in create table syntax for greenplum. As i have mentioned, i wanted to get few inputs from people who faced such issues in hive. so i have tagged hive as well. But, greenplum external table syntax supports NULL AS phrase in which we can specify the form of NULL that you want to keep.

Srini
  • 3,334
  • 6
  • 29
  • 64