I have created a partitioned hive table. I inserted data into this table. Now suppose I execute one select *
query using where
clause then how can i make sure that hive query is using partioning?
Asked
Active
Viewed 816 times
2

leftjoin
- 36,950
- 8
- 57
- 116

Kamal Tomar
- 49
- 2
-
1Please consider adding code and response examples to your question to make it more likely to get constructive answers: For example: what code have you tried? And how did the response deviate from your expected outcome? Were there errors? [Here's a helpful resource.](https://stackoverflow.com/help/mcve) – Michael C. Jun 14 '18 at 14:11
-
1sure, i will make sure it..It was a normal query which doesn't required codes at all – Kamal Tomar Jun 16 '18 at 15:57
2 Answers
0
Assuming that your HIVE data was split on 3 partitions, and each HDFS partition contains a single file that is not bigger than one HDFS block, a SELECT * with a WHERE on a partition value should trigger a single YARN mapper.
Note that this behaviour may be different with new HIVE engines like TEZ and LLAP (Hortonworks) or with Impala (Cloudera)

Harold
- 455
- 4
- 10
0
You can validate partition pruning work by executing EXPLAIN EXTENDED
command.
EXPLAIN EXTENDED select * from mytable where load_date='2018-01-01';
And partition information in the plan looks like:
Path -> Partition: s3n://bucket_name/mytable/load_date=2018-01-01
Partition base file name: load_date=2018-01-01
input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
partition values: load_date 2018-01-01

leftjoin
- 36,950
- 8
- 57
- 116