This is a small representative sample of my data:
Team <- rep(c("ind", "sas", "ind", "sas"),c(4,8,2,4))
Player <- c("Paul George", "David West", "Roy Hibbert",
"Paul George", "Tim Duncan", "Manuel Ginobili",
"Tony Parker", "Boris Diaw","Danny Green",
"Kawhi Leonard", "Matt Bonner", "Patty Mills",
"George Hill", "C.J.Miles","Tim Duncan",
"Manuel Ginobili", "Tony Parker", "Boris Diaw")
Team_PTS <- c(101,101,101,98,105,105,105,105,
105,105,105,105,98,98,89,89,89,128)
Date <- as.Date(c("2015-05-14", "2015-05-14", "2015-05-14",
"2015-05-16","2015-05-15", "2015-05-15", "2015-05-15",
"2015-05-15","2015-05-15", "2015-05-15", "2015-05-15",
"2015-05-15","2015-05-16","2015-05-16","2015-05-29",
"2015-05-29","2015-05-29","2015-06-03"))
Team_Gamenumber <- rep(c(1,2,1,2,2,3),c(3,1,8,2,3,1))
df <- data.frame(Team,Player,Team_PTS,Date, Team_Gamenumber)
df
Team Player Team_PTS Date Team_Gamenumber Desired_output
1 ind Paul George 101 2015-05-14 1 101
2 ind David West 101 2015-05-14 1 101
3 ind Roy Hibbert 101 2015-05-14 1 101
4 ind Paul George 98 2015-05-16 2 99.5
5 sas Tim Duncan 105 2015-05-15 1 105
6 sas Manuel Ginobili 105 2015-05-15 1 105
7 sas Tony Parker 105 2015-05-15 1 105
8 sas Boris Diaw 105 2015-05-15 1 105
9 sas Danny Green 105 2015-05-15 1 105
10 sas Kawhi Leonard 105 2015-05-15 1 105
11 sas Matt Bonner 105 2015-05-15 1 105
12 sas Patty Mills 105 2015-05-15 1 105
13 ind George Hill 98 2015-05-16 2 99.5
14 ind C.J.Miles 98 2015-05-16 2 99.5
15 sas Tim Duncan 89 2015-05-29 2 97
16 sas Manuel Ginobili 89 2015-05-29 2 97
17 sas Tony Parker 89 2015-05-29 2 97
18 sas Boris Diaw 128 2015-06-03 3 107.33
The desired output variable is the moving (cumulative) average of each team's points per game (sas and ind in this example).
I have tried:
library(dplyr)
df %>% group_by(Team) %>%
mutate(cumavg_PTS = cumsum(Team_PTS) / seq_along(Team_PTS))
However, that yields the wrong output because the data has one row per player, not one row per game. For example, Boris Diaw misses game 2 with sas but plays in game 3.
Also, I think cumsum is not the right approach in this case, because the average will be affected by the number of players who play in each game.
The 107.33 comes from the average of sas's first 3 games: (105 + 89 + 128)/3.
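
To make the target calculation concrete, here is a minimal sketch of one way the per-game average could be computed. It assumes every row for a given Team and Team_Gamenumber carries the same Team_PTS, which holds in the sample but may not in the full data. The names games and result are made up for illustration. The idea is to reduce the data to one row per team and game, take the running average there, and join it back onto the player rows:

```r
library(dplyr)

# Rebuild the relevant columns of the sample data so this block runs on its own
Team <- rep(c("ind", "sas", "ind", "sas"), c(4, 8, 2, 4))
Team_PTS <- c(101, 101, 101, 98, 105, 105, 105, 105,
              105, 105, 105, 105, 98, 98, 89, 89, 89, 128)
Team_Gamenumber <- rep(c(1, 2, 1, 2, 2, 3), c(3, 1, 8, 2, 3, 1))
df <- data.frame(Team, Team_PTS, Team_Gamenumber)

# One row per team and game (assumes Team_PTS is constant within a game)
games <- df %>%
  distinct(Team, Team_Gamenumber, Team_PTS) %>%
  arrange(Team, Team_Gamenumber) %>%
  group_by(Team) %>%
  # Running average over games, not over player rows
  mutate(cumavg_PTS = cumsum(Team_PTS) / seq_along(Team_PTS)) %>%
  ungroup() %>%
  select(Team, Team_Gamenumber, cumavg_PTS)

# Attach the per-game running average back to every player row
result <- df %>%
  left_join(games, by = c("Team", "Team_Gamenumber"))
```

With the sample above, result$cumavg_PTS should line up with the Desired_output column. The same cumsum expression from my attempt is used, but only after collapsing to one row per game, so the number of players in each game no longer affects the average.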