1

If I run

set hivevar:a = 1;
select * from t1 where partition_variable=${a};

Hive only pulls in the records from the appropriate partition. Alternately if I run

set hivevar:b = 6;
set hivevar:c = 5;
set hivevar:a = ${b}-${c};
select * from t1 where partition_variable=${a};

The condition on partition_variable is treated as a predicate rather than a partition, and hive goes through all records in the table.

This is obviously a contrived example, but in my particular use case it is necessary. Is there anyway to force hive to use this for partitioning?

Thanks in advance.

2 Answers2

0

Is the partition variable the column on which partition occurs. It works with following.

create table newpart
(productOfMonth string)
partitioned by (month int);


hive> select * from newpart;
OK
Cantaloupes 10 
Pumpkin     11

set hivevar:lastmonth = 11;
set hivevar:const = 1;
set hivevar:prevmonth = ${lastmonth}-${const};

hive> select * from newpart
    > where month = ${prevmonth};
OK
Cantaloupes 10
Sagar Shah
  • 118
  • 4
  • `partition_variable` is the column on which partitioning occurs. The issue isn't whether it works or not, indeed it will return the right answer. The issue is that it uses `partition_variable` as a predicate rather than only looking at the right partition. So it looks through almost 10000x the number of records it needs to. – Salman Iftikhar Sep 01 '16 at 18:57
  • Can you provide the explain plan? – Sagar Shah Sep 02 '16 at 16:25
0

I was never able to get partitioning to work properly with dynamically generated hive variables, but a simple workaround was to create a table containing the variables and join on them rather than using them in the where clause.