0

I am trying to import a file into hive. The sample data is as following

1::Toy Story (1995)::Animation|Children's|Comedy 
2::Jumanji (1995)::Adventure|Children's|Fantasy
3::Grumpier Old Men (1995)::Comedy|Romance
4::Waiting to Exhale (1995)::Comedy|Drama

My table declaration is as following

create table movies(id int,title string,genre string) row format delimited fields terminated by '::'; 

But after loading the data, my table shows data for the first two fields only.

Total MapReduce CPU Time Spent: 1 seconds 600 msec
OK
1       Toy Story (1995)    
2       Jumanji (1995)  
3       Grumpier Old Men (1995) 
4       Waiting to Exhale (1995)    
Time taken: 22.087 seconds

Can anyone help me on why this is happening or how to debug this.

Snehansu
  • 13
  • 3

1 Answers1

0

Hive row delimiter will take only one character by default, since you have two characters '::' Please try Creating Table with MultiDelimitSerDe

Query:

CREATE TABLE movies (id int,title string,genre string) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
WITH SERDEPROPERTIES ("field.delim"="::")
STORED AS TEXTFILE;

Output:

hive> select * from movies;
OK
1   Toy Story (1995)    Animation|Children's|Comedy 
2   Jumanji (1995)  Adventure|Children's|Fantasy
3   Grumpier Old Men (1995) Comedy|Romance
4   Waiting to Exhale (1995)    Comedy|Drama

Please refer similar post: Load data into Hive with custom delimiter

Community
  • 1
  • 1
Aditya
  • 2,385
  • 18
  • 25