2

I have a folder structure in HDFS like below. However, no partitions were actually created on the table using the ALTER TABLE ADD PARTITION commands, even though the folder structure was setup as if the table had partitions.

How can I automatically add all the partitions to the Hive table? (Hive 1.0, external table)

/user/frank/clicks.db
  /date=20190401
    /file0004.csv
  /date=20190402
    /file0009.csv
  /date=20190501
    /file0000.csv
    /file0001.csv
...etc
leftjoin
  • 36,950
  • 8
  • 57
  • 116
lifebythedrop
  • 401
  • 3
  • 18

1 Answers1

3

Use msck repair table command:

MSCK [REPAIR] TABLE tablename;

or

ALTER TABLE tablename RECOVER PARTITIONS;

if you are running Hive on EMR.

Read more details about both commands here: RECOVER PARTITIONS

leftjoin
  • 36,950
  • 8
  • 57
  • 116
  • Ok, this is great - beats writing a bash/beeline script. Are there any downsides to using this as way to pick up new partitions on a regular basis? – lifebythedrop Jul 16 '19 at 02:22
  • 1
    @lifebythedrop Adding only newly added partition using alter table add partition may work a bit faster because it will not scan all table directory and fetch all partition metadata to compare with what already added. Also using alter table add partition you can add partition outside table location. If you do not need to add partitions located outside table location, recover works fine. – leftjoin Jul 16 '19 at 05:46