1

I have trouble when using hive conf in INSERT INTO a partition table.

Below is my code

set PERIOD = 202305;
set DAY=concat(substr(${hiveconf:PERIOD},1,4),'-',substr(${hiveconf:PERIOD},5,2),'-01');
-- DAY result = 20230501
set COMPUTE_MONTH = date_format(add_months(from_unixtime(unix_timestamp(${hiveconf:DAY}, 'yyyy-MM-dd'), 'yyyy-MM-dd'), -1), 'yyyyMM');
--COMPUTE_MONTH = 202304

-- Case 1
ALTER TABLE db.table DROP PARTITION (snapshot = ${hiveconf:PERIOD});
INSERT INTO TABLE db.table PARTITION (snapshot = ${hiveconf:PERIOD})
SELECT
...

-- Case 2
ALTER TABLE db.table DROP PARTITION (snapshot = ${hiveconf:COMPUTE_MONTH});
INSERT INTO TABLE db.table PARTITION (snapshot = ${hiveconf:COMPUTE_MONTH})
SELECT
...
  • Case 1 run well with hiveconf PERIOD
  • Case 2 return error as attached image case_2_log_returned

I need help, how to use COMPUTE_MONTH (previous month of PERIOD) in INSERT INTO a partition table in this case.

Thanks in advance.

Ftu_Sneaky
  • 19
  • 3

1 Answers1

1

Variables in Hive are not being calculated, they are working simply as text substitution. Your COMPUTE_MONTH is not calculated in Hive, it is equal the whole text expression: date_format(add_months(from_unixtime(unix_timestamp(${hiveconf:DAY}, 'yyyy-MM-dd'), 'yyyy-MM-dd'), -1), 'yyyyMM') and substituted as is in the partition specification.

Such expressions are not allowed in partition spec in Hive. Maximum what you can use in partition specification is scalar variable and <, >, = operators.

Hive does not calculate variables.

The workaround is to calculate everything in shell and pass to Hive as a parameters, for example like in this answer.

leftjoin
  • 36,950
  • 8
  • 57
  • 116