
I would like to know the best way(s) of adding partitions to an external table. I have an external table on S3 in Hive, partitioned as vehicle=/date=/hr=


A new vehicle can be added at any time of day, and some vehicles will have no data for a couple of hours in a day or for a couple of days.

A few possible solutions:
- msck repair table: it takes a lot of time
- Add partitions via a script: I may not know when a new vehicle gets created or which hours have no data for a vehicle

How do people generally solve this problem of adding partitions to external tables?

leftjoin
Nipun

1 Answer


msck repair table is the right way to do this. If it runs too slowly, try switching off stats autogathering before repairing the table:

set hive.stats.autogather=false;

You can enable it again after recovering partitions.

Most probably you are hitting HIVE-18743 or a related bug. In my case this helped.
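Put together, the whole sequence might look like the sketch below. The table name vehicle_events is hypothetical (the question does not give one), so substitute your own; the final SHOW PARTITIONS is only an optional check that the new vehicle=/date=/hr= partitions were picked up.

```sql
-- Assumption: vehicle_events is a placeholder name for the external table on S3.

-- Disable automatic stats gathering for this session before the repair.
SET hive.stats.autogather=false;

-- Scan the table location and register any partitions missing from the metastore.
MSCK REPAIR TABLE vehicle_events;

-- Re-enable stats gathering once the partitions are recovered.
SET hive.stats.autogather=true;

-- Optional: list the registered partitions to verify the result.
SHOW PARTITIONS vehicle_events;
```

SET only affects the current session, so the three statements need to run in the same session, for example in one script file.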

leftjoin