Insert data from HDFS into an Avro-formatted, partitioned Hive table

Question

I have created a Hive table named employee (stored as Avro), partitioned by department.

I have an Avro dataset in an HDFS location. The dataset also contains the department id.

I would like to load this HDFS data into the Hive table so that each record ends up in the partition for its department.

How can I achieve this?

shankarsh15 · Accepted Answer · 2016-05-17T15:45:03.283


There are two ways to do it.

1. Manual partitioning

LOAD DATA INPATH '<hdfs path>' INTO TABLE employee_table PARTITION (deptId='1');

LOAD DATA INPATH '<hdfs path>' INTO TABLE employee_table PARTITION (deptId='2');

2. Dynamic partitioning

a. Create an intermediate (staging) table over the existing data

b. Create the employee table with the partition column

c. Load the data from the intermediate table into the partitioned table
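The dynamic partitioning steps can be sketched in HiveQL roughly as follows. This is a minimal sketch, not a definitive script: the staging table name employee_staging, the columns id and name, the column types, and the location '/path/to/avro/data' are all hypothetical placeholders, and STORED AS AVRO assumes Hive 0.14 or later. The staging columns are assumed to match the field names in the Avro files.

```sql
-- a. Intermediate (staging) table over the existing Avro files.
--    Table name, columns, types and LOCATION are assumptions for illustration.
CREATE EXTERNAL TABLE employee_staging (
  id     INT,
  name   STRING,
  deptId INT
)
STORED AS AVRO
LOCATION '/path/to/avro/data';

-- b. Partitioned target table. The partition column is declared in
--    PARTITIONED BY, not in the regular column list.
CREATE TABLE employee_table (
  id   INT,
  name STRING
)
PARTITIONED BY (deptId INT)
STORED AS AVRO;

-- c. Enable dynamic partitioning, then copy the rows across.
--    nonstrict mode allows every partition value to be determined dynamically.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- The partition column must come last in the SELECT list.
INSERT INTO TABLE employee_table PARTITION (deptId)
SELECT id, name, deptId
FROM employee_staging;
```

With this approach Hive routes each row to a partition based on its deptId value, so the source files do not need to be organized by department beforehand. LOAD DATA, by contrast, only moves files into the partition directory you name without inspecting their contents.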

edited May 17 '16 at 15:45

answered May 17 '16 at 15:33

shankarsh15

Thanks @shankarsh15. With manual partitioning, the data in HDFS must already be organized by the target partitions I want to create, mustn't it? Otherwise, all the data in the HDFS path would end up in a single partition. Am I right? – Sivakumar May 18 '16 at 06:35
