I want to create five new columns that count how often a certain value of "stars" has happened for this business before this particular row (i.e., summing up over all rows with a smaller rolingcount but holding the business constant).
For the first row of each business (i.e., where rolingcount == 0), it should be NA, because there have been no previous occurrences for this business.
Here is an exemplary dataset:
business <-c("aaa","aaa","aaa","bbb","bbb","bbb","bbb","bbb","ccc","ccc","ccc","ccc","ccc","ccc","ccc","ccc")
rolingcount <- c(1,2,3,1,2,3,4,5,1,2,3,4,5,6,7,8)
stars <- c(5,5,3,5,5,1,2,3,5,1,2,3,4,5,5)
df <- cbind(business, rolingcount, stars)
I feel my problem is related to this, but with a gist, that I don't get to work: Numbering rows within groups in a data frame
I also unsuccessfully experimented with while loops.
Ideally, something like this will be the output. (I leave out previousthree, previoustwo, previousone, because I believe they will work identical).
business <- c("aaa","aaa","aaa","bbb","bbb","bbb","bbb","bbb","ccc","ccc","ccc","ccc","ccc","ccc","ccc","ccc")
rolingcount <- c(1,2,3,1,2,3,4,5,1,2,3,4,5,6,7,8)
stars <- c(5,5,3,5,5,1,2,3,5,1,2,3,4,5,5)
previousfives <- c(NA,1,2,NA,1,2,2,2,NA,1,1,1,1,1,2,3)
previousfours <- c(NA,0,0,NA,0,0,0,0,NA,0,0,0,0,1,1,1)
df <- cbind(business, rolingcount, stars, previousfives, previousfours)`
Since, I will have to do this for more than 10 M rows, a fast option would be cool. Your help is much appreciated! :)