I have created a Hive table named employee (stored as Avro), partitioned by department.

I have an Avro dataset in an HDFS location. The dataset also contains the department ID.

I would like to load this data from HDFS into the Hive table so that, during the load, each record ends up in its respective partition.

How can I achieve this? Any ideas?

Sivakumar

1 Answer

There are two ways of doing it.

1. Manual (static) partitioning

LOAD DATA INPATH '<hdfs path>' INTO TABLE employee_table PARTITION (deptId='1');

LOAD DATA INPATH '<hdfs path>' INTO TABLE employee_table PARTITION (deptId='2');

2. Dynamic partitioning

a. Create an intermediate (staging) table over the HDFS data

b. Create the employee table with the partition column

c. Load the data from the intermediate table into the partitioned table
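
The dynamic partitioning steps above could look roughly like the following HiveQL. This is only a sketch under assumptions: the staging table name employee_staging, the columns id and name, and the path /data/employee_avro are hypothetical placeholders, and deptId is assumed to be a field in the Avro data. STORED AS AVRO requires Hive 0.14 or later.

```sql
-- Allow dynamic partitioning; nonstrict mode lets every partition
-- value come from the data instead of being given statically.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- a. Staging table over the existing Avro files (hypothetical path and
--    columns). EXTERNAL keeps the source files in place.
CREATE EXTERNAL TABLE IF NOT EXISTS employee_staging (
  id     INT,
  name   STRING,
  deptId STRING
)
STORED AS AVRO
LOCATION '/data/employee_avro';

-- b. Partitioned target table. The partition column is declared in
--    PARTITIONED BY, not in the regular column list.
CREATE TABLE IF NOT EXISTS employee_table (
  id   INT,
  name STRING
)
PARTITIONED BY (deptId STRING)
STORED AS AVRO;

-- c. Hive routes each row to a partition based on the last column
--    in the SELECT list, so deptId must come last.
INSERT INTO TABLE employee_table PARTITION (deptId)
SELECT id, name, deptId
FROM employee_staging;
```

Unlike LOAD DATA, which only moves files into the partition directory you name, the INSERT ... SELECT reads each row, so the source files do not need to be pre-sorted by department.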

shankarsh15
Thanks @shankarsh15. With manual partitioning, the data in HDFS must already be organized according to the target partitions I wish to create, mustn't it? Otherwise, all the data in the HDFS path would end up in a single partition. Am I right? – Sivakumar May 18 '16 at 06:35