I am learning HADOOP for last 1 months. I am using Partition in hive table. How to do Partition in Pig? It would be very useful for my assessment if any one says the answer. Thanks, Anbu K.
Asked
Active
Viewed 3,452 times
1 Answers
2
Hcatalog
provides metadata and table management layer for hadoop.
It allows Hadoop users—whether they use MapReduce, Pig, Hive, or other tools—to view their data in HDFS as if it were in tables. These tables are partitioned and have consistent schemas.
Pig can work with HCatalog
’s partitioning. If you place the filter statement that describes which partitions you want to read immediately after the load, Pig will push that into the load so that HCatalog
returns only the relevant partitions.
/* myscript.pig */
A = LOAD 'tablename' USING org.apache.hcatalog.pig.HCatLoader();
-- date is a partition column; age is not
B = filter A by date == '20100819' and age < 30;
-- both date and country are partition columns
C = filter A by date == '20100819' and country == 'US';

madhu
- 1,140
- 8
- 14
-
Thanks for your reply, Can you please explain with example input and output data. Thanks, Anbu k – Sep 29 '15 at 06:42
-
A table called 'tablename' is already a partitioned table available in hive. HCatalog uses hive's metastore db for tables./* myscript.pig */ A = LOAD 'tablename' USING org.apache.hcatalog.pig.HCatLoader(); -- date is a partition column; age is not B = filter A by date == '20100819' and age < 30; -- both date and country are partition columns C = filter A by date == '20100819' and country == 'US'; ... ... – madhu Sep 29 '15 at 07:10
-
What i understood is tablename: (Hive) id, name,age,date date - partition column myscript.pig A = LOAD 'tablename' USING org.apache.hcatalog.pig.HCatLoader(); Note: date is a partition column; age is not ? B = filter A by date == '20100819' and age < 30 both date and country are partition columns? C = filter A by date == '20100819' and country == 'US' – Sep 29 '15 at 07:33
-
the table will be having another column country which is also partitioned column. In the first filter condition we are using one partitioned column and one normal column. And after that we are using two partitioned columns in filter condition. Just a sample filter conditions. – madhu Sep 29 '15 at 07:50
-
@Madhu : Would suggest to add the details/ example added in comments to the answer – Murali Rao Oct 01 '15 at 20:33