12

I have to change the partition column name (not partition spec), I looked for the commands in hive wiki and some google pages. I can find the options for altering the partition spec,

i.e. For example

In /table/country='US' I can change US to USA, but I want to change country to continent.

I feel like the only option available for changing partition column name is dropping and re-creating the table. Is there is any other option available please help me.

Thanks in advance.

Vickyexpert
  • 3,147
  • 5
  • 21
  • 34
Sunil
  • 131
  • 1
  • 1
  • 6

5 Answers5

4

You can change column name in metadata by following: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ChangeColumnName/Type/Position/Comment

But as the document says, it only changes the metadata. Hive partitions are implemented as directories with the naming pattern columnName=spec. So you also need to change the names of those directories on HDFS by using "hadoop fs" command.

Kai
  • 125
  • 7
  • 2
    You can change column name but not partition name – mac Jun 25 '18 at 06:44
  • This doesn't answer the question once partition columns can have their name changed. I've got the same problem, when trying to running the change code, I get the following error: `Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Invalid column reference semanaano (state=08S01,code=1)` – Joabe Lucena Nov 06 '18 at 19:35
0

You have alter the partition column using simple swap method.

  • Create a new temp table which is same schema as current table.

  • Move all files in the old table to newly create table location.

    hadoop fs -mv <current_table_name> <temp_table_name>

  • Alter the schema of the original table (Rename or drop the partitions)

  • Recopy/load the temp table data to the original table with appropriate partition values.

    hadoop fs -mv <temp_table_name> <current_table_name>

  • msck repair the the original table & drop the temp_table.

NOTE : mv command move the file from one location to another with reducing the copy time. alternately we can use LOAD DATA INPATH for copy the data to the original table.

Community
  • 1
  • 1
Skanda Shastry
  • 182
  • 6
  • 17
0

You can not change the partition column in hive infact Hive does not support alterting of partitioning columns

You can think of it this way - Hive stores the data by creating a folder in hdfs with partition column values - Since if you trying to alter the hive partition it means you are trying to change the whole directory structure and data of hive table which is not possible exp if you have partitioned on year this is how directory structure looks like

tab1/clientdata/**2009**/file2
tab1/clientdata/**2010**/file3

If you want to change the partition column you can perform below steps

Create another hive table with required changes in partition column

Create table new_table ( A int, B String.....)

Load data from previous table

Insert into new_table partition ( B ) select A,B from table Prev_table
Strick
  • 1,512
  • 9
  • 15
-1

As you said, rename the value for of the partition is very straightforward:

hive> ALTER TABLE test.usage PARTITION (country ='US') RENAME TO PARTITION (date='USA');

I know that this is not what you are looking for. Unfortunately, given that your data is already partitioned by country, the only option you have is to drop the table, remove the data (supposing your table is external) from the HDFS and reinsert the data using continent as partition.

What I would do in your case is to have multiple partition levels, so that your folder structure will look like that:

/path/to/the/data/continent='america'/country='usa'
/path/to/the/data/continent='america'/country='mexico'
/path/to/the/data/continent='europe'/country='spain'
/path/to/the/data/continent='europe'/country='italy'
...

That way you can query the data for different levels of granularity (in this case continent and country).

dbustosp
  • 4,208
  • 25
  • 46
-2

Adding solution here for later:

  • Use case: Change partition column from STRING to INT

    set hive.mapred.mode=norestrict; 
    alter table {table_name} partition column ({column_name} {column_type}); 
    
    e.g. ALTER TABLE employee PARTITION COLUMN dept INT;
    
viczsaurav
  • 37
  • 6