
I'm using static partitioning in Hive to segregate data into subdirectories based on a date field. Since I have daily loads into Hive, I'll need 365 partitions per year for each table (14 tables in total).
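For context, a daily static-partition load looks roughly like this (table and column names are hypothetical, just to illustrate the setup described above):

```sql
-- Hypothetical table, statically partitioned by load date
CREATE TABLE sales (
  order_id BIGINT,
  amount   DOUBLE
)
PARTITIONED BY (load_date STRING);

-- Each daily load targets one explicitly named partition
LOAD DATA INPATH '/staging/sales/2015-03-18'
INTO TABLE sales PARTITION (load_date = '2015-03-18');
```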

Is there any limit on the number of static partitions that can be created in Hive?

Dynamic partitioning gives an error during Sqoop import if the number of partitions exceeds the hive.exec.max.dynamic.partitions.pernode threshold (default 100).

I have a 5-node HDP cluster, 3 of which are datanodes.

Will it hamper cluster performance if I increase the number of partitions that can be created in Hive?

Is that limit only for dynamic partitions, or does it apply to static partitions as well?

Reference

Check the troubleshooting and best practices section: https://cwiki.apache.org/confluence/display/Hive/Tutorial

Kindly suggest.

Chhaya Vishwakarma

1 Answer


For partitioning on a date field, the best approach is to partition by year/month/day.

That said, you should choose your partition strategy based on your requirements. There is no hard limit on the number of partitions as such, unless you are over-partitioning, which means unnecessarily creating too many partitions, each storing only a very small amount of data.
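A year/month/day layout as suggested above might look like this (table and column names are hypothetical):

```sql
-- Hypothetical table partitioned hierarchically by year/month/day
CREATE TABLE sales (
  order_id BIGINT,
  amount   DOUBLE
)
PARTITIONED BY (year INT, month INT, day INT)
STORED AS ORC;

-- A daily load then writes into a single leaf partition
INSERT OVERWRITE TABLE sales PARTITION (year = 2015, month = 3, day = 18)
SELECT order_id, amount FROM staging_sales;
```

One advantage of the hierarchical layout is that queries filtering on a whole year or month can prune at the directory level instead of scanning 365 flat date partitions.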

Regarding the error, you can fix it by raising the limit: set hive.exec.max.dynamic.partitions.pernode in Hive.
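For example, the relevant properties can be raised per session before the insert runs (the values shown here are illustrative; pick numbers that fit your actual partition counts):

```sql
-- Allow more dynamic partitions per node and in total (example values)
SET hive.exec.max.dynamic.partitions.pernode=1000;
SET hive.exec.max.dynamic.partitions=10000;

-- Required if no static partition column is supplied in the insert
SET hive.exec.dynamic.partition.mode=nonstrict;
```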

Hope this helps.

sunil
  • I can increase the partition count, but is it advisable for a 3-node cluster? That's what I'm trying to figure out... I will have 365 partitions per table per year, and I need to store 7 years of data – Chhaya Vishwakarma Mar 18 '15 at 07:20
  • I think that should be fine, as long as you have enough data for each partition. – sunil Mar 18 '15 at 11:43