1

I am trying to move the location of a table to a new directory. Let's say the original location is /data/dir. For example, I am trying something like this:

hadoop fs -mkdir /data/dir_bkp
hadoop fs -mv /data/dir/* /data/dir_bkp

I then do hive commands such as:

ALTER TABLE db.mytable RENAME TO db.mytable_bkp;
ALTER TABLE db.mytable_bkp SET LOCATION /data/dir_bkp;

Is it fine to move the directory files before changing the location of the table? After I run these commands, will the table mytable_bkp be populated as it was before?

leftjoin
  • 36,950
  • 8
  • 57
  • 116
formicaman
  • 1,317
  • 3
  • 16
  • 32
  • It is partitioned. So I would just need to add `MSCK REPAIR TABLE db.mytable_bkp` at the end and the backup table will still be populated even if moving the files first? – formicaman Feb 10 '21 at 15:28
  • Thanks. That was location was a typo. But if the table is loaded if the location contains data, wouldn't the table be empty if I first move the files? Since there would be no data at that location before I move the location of the table. – formicaman Feb 10 '21 at 15:47
  • 1
    @leftjoin Thanks for the help. I did the above commands, and can see my data moved from `dir` to `dir_bkp`, but even after running `msck repair table db.mytable_bkp`, the table is empty. – formicaman Feb 10 '21 at 17:00
  • check SHOW PARTITIONS - are partitions mounted to old locations? – leftjoin Feb 10 '21 at 17:22
  • Is table EXTERNAL or Managed? If it is EXTERNAL, you can try drop and create and run MSCK again I guess it can be the issue with partitions mounted to old locations. If the table is MANAGED - make it external before dropping – leftjoin Feb 10 '21 at 17:25
  • It is an external table. I can see the partitions, but you're correct -- they are in the old locations. Is the way to fix this by dropping and recreate the table? – formicaman Feb 10 '21 at 18:03
  • Sorry for misleading you previously with rename table. Explained this scenario in the answer.. – leftjoin Feb 11 '21 at 10:32

1 Answers1

1

After you executed mv command, your original table will become empty. because mv removed data files.

After you renamed table, it is empty, because it's location is empty.

After you executed ALTER TABLE SET LOCATION - the table is empty because partitions are mounted to old locations (now empty). Sorry for misleading you in this step previously. After rename table, partitions remain as they were before rename. Each partition can normally have it's own location outside table location.

If table is MANAGED, make it EXTERNAL:

alter table table_name SET TBLPROPERTIES('EXTERNAL'='TRUE'); 

Now drop table + create table with new location and run MSCK to create partitions:

MSCK [REPAIR] TABLE tablename;

If you are on Amazon EMR, run ALTER TABLE tablename RECOVER PARTITIONS; instead of MSCK

leftjoin
  • 36,950
  • 8
  • 57
  • 116