Below is my dataset.
user,device,time_spent,video_start
userA,mob,5,1
userA,desk,5,2
userA,desk,5,3
userA,mob,5,2
userA,mob,5,2
userB,desk,5,2
userB,mob,5,2
userB,mob,5,2
userB,desk,5,2
I want to compute the following aggregation for each user:
user total_time_spent device_distribution
userA 25 {mob:60%,desk:40%}
userB 20 {mob:50%,desk:50%}
Can someone help me achieve this using the Spark 2.0 API, preferably in Java? I have tried using UserDefinedAggregateFunction, but it doesn't support a group within a group: inside each user's group I also have to group by device to find the aggregated time spent on each device.
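To make the target computation concrete, here is a minimal plain-Java sketch (no Spark involved) of the two-level grouping I am after. The class and method names (`DeviceDistribution`, `Row`, `UserStats`, `aggregate`, `sampleRows`) are made up for illustration, and it assumes the percentages are shares of time spent per device rather than shares of row counts. What I need is the Spark equivalent of this.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only (no Spark); all names here are hypothetical.
class DeviceDistribution {

    // One input row: user, device, time_spent (video_start is not needed here).
    static final class Row {
        final String user;
        final String device;
        final long timeSpent;

        Row(String user, String device, long timeSpent) {
            this.user = user;
            this.device = device;
            this.timeSpent = timeSpent;
        }
    }

    // Aggregated result for a single user.
    static final class UserStats {
        long totalTimeSpent = 0;
        // Inner grouping: device -> time spent on that device.
        final Map<String, Long> timeByDevice = new LinkedHashMap<>();

        // Percentage of this user's total time spent on each device.
        Map<String, Double> deviceDistribution() {
            Map<String, Double> pct = new LinkedHashMap<>();
            for (Map.Entry<String, Long> e : timeByDevice.entrySet()) {
                pct.put(e.getKey(), 100.0 * e.getValue() / totalTimeSpent);
            }
            return pct;
        }
    }

    // Outer grouping by user, inner grouping by device, in a single pass.
    static Map<String, UserStats> aggregate(List<Row> rows) {
        Map<String, UserStats> byUser = new LinkedHashMap<>();
        for (Row r : rows) {
            UserStats s = byUser.computeIfAbsent(r.user, u -> new UserStats());
            s.totalTimeSpent += r.timeSpent;
            s.timeByDevice.merge(r.device, r.timeSpent, Long::sum);
        }
        return byUser;
    }

    // The dataset from the question, minus the video_start column.
    static List<Row> sampleRows() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("userA", "mob", 5));
        rows.add(new Row("userA", "desk", 5));
        rows.add(new Row("userA", "desk", 5));
        rows.add(new Row("userA", "mob", 5));
        rows.add(new Row("userA", "mob", 5));
        rows.add(new Row("userB", "desk", 5));
        rows.add(new Row("userB", "mob", 5));
        rows.add(new Row("userB", "mob", 5));
        rows.add(new Row("userB", "desk", 5));
        return rows;
    }

    public static void main(String[] args) {
        // Prints one line per user: user, total time, device percentages.
        aggregate(sampleRows()).forEach((user, s) ->
            System.out.println(user + " " + s.totalTimeSpent + " " + s.deviceDistribution()));
    }
}
```

Running `main` prints one line per user with the total time and the per-device percentages. The inner `timeByDevice` map is the "group within group" part that I could not express with a single UserDefinedAggregateFunction.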