2

I have created a partitioned hive table. I inserted data into this table. Now suppose I execute one select * query using where clause then how can i make sure that hive query is using partioning?

leftjoin
  • 36,950
  • 8
  • 57
  • 116
  • 1
    Please consider adding code and response examples to your question to make it more likely to get constructive answers: For example: what code have you tried? And how did the response deviate from your expected outcome? Were there errors? [Here's a helpful resource.](https://stackoverflow.com/help/mcve) – Michael C. Jun 14 '18 at 14:11
  • 1
    sure, i will make sure it..It was a normal query which doesn't required codes at all – Kamal Tomar Jun 16 '18 at 15:57

2 Answers2

0

Assuming that your HIVE data was split on 3 partitions, and each HDFS partition contains a single file that is not bigger than one HDFS block, a SELECT * with a WHERE on a partition value should trigger a single YARN mapper.

Note that this behaviour may be different with new HIVE engines like TEZ and LLAP (Hortonworks) or with Impala (Cloudera)

Harold
  • 455
  • 4
  • 10
0

You can validate partition pruning work by executing EXPLAIN EXTENDED command.

EXPLAIN EXTENDED select * from mytable where load_date='2018-01-01';

And partition information in the plan looks like:

Path -> Partition: s3n://bucket_name/mytable/load_date=2018-01-01 
Partition base file name: load_date=2018-01-01 
input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat 
output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat 
partition values: load_date 2018-01-01
leftjoin
  • 36,950
  • 8
  • 57
  • 116