when i run this sql, it takes about half a hour and it's unacceptable. the source table have 20M records.
each rank() create a mapreduce job running for 7 minutes with all the data in one reduce. how to optimize this sql? can these rank mapreduce run in parallel?
create table tmp.xxxx as
SELECT t.end_user_id,
t.pm_net_amt,
t.parnt_ordr_num,
rank() over(order by t.pm_net_amt) pm_net_amt_rank,
rank() over(order by t.parnt_ordr_num) parnt_ordr_num_rank,
rank() over(order by t.cat_lvl3_num) cat_lvl3_num_rank,
rank() over(order by t.discount_rate) discount_rate_rank
from bi_tmp.bbbbbb t;