I have the following dataset:
adv_id<-c("abc","abc","abc","pqr","pqr","pqr","xyz","xyz","xyz")
seg_id<-c("1","2","3","4","5","6","7","8","9")
value<-c(120,450,108,1004,567,768,111,222,3334)
data<-data.frame(adv_id,seg_id,value)
I want to calculate parent total percent on adv_id level. i.e. like when we use the excel pivot. So my output should look like
adv_id seg_id value percentage
abc 1 120 17.7%
abc 2 450 66.37%
abc 3 108 15.93%
pqr 4 1004 42.92%
pqr 5 567 24.24%
pqr 6 768 32.83%
xyz 7 111 3.03%
xyz 8 222 6.05%
xyz 9 3334 90.92%
I used the following code
totaluser=0
percent=c()
for (i in 2:nrow(data))
{
if data$adv_id[i]==data$adv_id[i-1]
{
totaluser = totaluser+rawdata$value[i]
percentage[i-1] = (data$value[i-1]/totaluser)*100
}
else {totaluser=0}
}
and using sqldf package
percentage<-sqldf("select adv_id,seg_id,value,sum(value)/sum(sum(value)) over(partition by adv_id) as 'percentage' from data").
But both times i cannot get the desired output. Can we use the summarise function in dplyr package somehow?