Can someone help me calculate the running sum of a column until it reaches a certain value? Use case: find the top products that together produced 50% of the revenue.
Is there a library like piggybank that does this? I couldn't find anything for it in piggybank.
I am trying to implement a UDF, but is that really the only way? :(
Here is what the data structure looks like:
productId, totalProfitByProduct, totalProfitByCompany, totalRevenueOfCompany
The data is sorted in descending order by totalProfitByProduct. totalProfitByCompany and totalRevenueOfCompany remain the same for every row.
Now I want to compute a running sum over totalProfitByProduct, starting from the top row, and get the top products that together generated more than 50% of totalProfitByCompany or totalRevenueOfCompany.
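
To make the logic I am after concrete, here is a minimal sketch in plain Python (not Pig). The function name `top_products`, the `fraction` parameter, and the sample rows are made up for illustration. It assumes the rows are already sorted in descending order by totalProfitByProduct, that the threshold is taken against totalProfitByCompany, and that the product whose profit pushes the running sum past the threshold should be included in the result.

```python
def top_products(rows, fraction=0.5):
    """Return product ids, from the top, until their summed profit
    exceeds `fraction` of the company total.

    Each row is (productId, totalProfitByProduct,
                 totalProfitByCompany, totalRevenueOfCompany).
    Rows are assumed to be sorted by totalProfitByProduct, descending.
    """
    selected = []
    running = 0.0
    for product_id, profit, company_profit, _company_revenue in rows:
        running += profit
        selected.append(product_id)
        # Stop once the running sum is strictly greater than the threshold.
        # (Assumption: the row that crosses the threshold is kept.)
        if running > fraction * company_profit:
            break
    return selected


# Hypothetical sample data: the company totals repeat on every row.
sample_rows = [
    ("A", 40, 100, 500),
    ("B", 30, 100, 500),
    ("C", 20, 100, 500),
    ("D", 10, 100, 500),
]

print(top_products(sample_rows))
```

With the sample rows, the running sum is 40 after A and 70 after B, so the loop stops at B. To compare against totalRevenueOfCompany instead, the threshold would use the fourth field rather than the third.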