I have two columns, one of products, and one of the dates they were bought. I am able to order the dates by applying the sort_array(dates) function, but I want to be able to sort_array(products) by the purchase date. Is there a way to do that in Hive?
Tablename is
ClientID Product Date
100 Shampoo 2016-01-02
101 Book 2016-02-04
100 Conditioner 2015-12-31
101 Bookmark 2016-07-10
100 Cream 2016-02-12
101 Book2 2016-01-03
Then, getting one row per customer:
select
clientID,
COLLECT_LIST(Product) as Prod_List,
sort_array(COLLECT_LIST(date)) as Date_Order
from tablename
group by 1;
As:
ClientID Prod_List Date_Order
100 ["Shampoo","Conditioner","Cream"] ["2015-12-31","2016-01-02","2016-02-12"]
101 ["Book","Bookmark","Book2"] ["2016-01-03","2016-02-04","2016-07-10"]
But what I want is the order of the products to be tied to the correct chronological order of purchases.