I was trying to create Partition and buckets using HIVE.
For setting some of the properties:
set hive.enforce.bucketing = true;
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
Below is the code for creating the table:
CREATE TABLE transactions_production
( id string,
dept string,
category string,
company string,
brand string,
date1 string,
productsize int,
productmeasure string,
purchasequantity int,
purchaseamount double)
PARTITIONED BY (chain string) clustered by(id) into 5 buckets
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
Below is the code for inserting data into the table:
INSERT OVERWRITE TABLE transactions_production PARTITION (chain)
select id, dept, category, company, brand, date1, productsize, productmeasure,
purchasequantity, purchaseamount, chain from transactions_staging;
What went wrong:
Partitions and buckets are getting created in HDFS but the data is present only in the 1st bucket of all the partitions; all the remaining buckets are empty.
Please let me know what i did wrong and how to resolve this issue.