2

I want to compute trailing mean GPA by Name from the table below at each index. For example at index 9, I want to have the mean GPA for C from indices 3, 6, 9 and likewise for A and B. At index 6 I want to have the GPA for C from indices 3 and 6. tapply alone won't solve my problem because I want a value for each row.

What's a clean way of doing this ?

Index Name GPA
1     A    5
2     B    6
3     C    7
4     A    5
5     B    6
6     C    6
7     A    7
8     B    6
9     C    3
C8H10N4O2
  • 18,312
  • 8
  • 98
  • 134
user196711
  • 311
  • 5
  • 17

2 Answers2

3

Minimally reproducible example:

g <- data.frame(Name=rep(c("A","B","C"),3),
                GPA=c(5,6,7,5,6,6,7,6,3))

For a cumulative average, I would just use cumsum() and seq_along() restricted to rows with the same Name.

for(i in unique(g$Name)){
  gpa <- g$GPA[g$Name==i]
  g$cum_avg[g$Name==i] <- cumsum(gpa)/seq_along(gpa)
}
g

Produces:

  Name GPA  cum_avg
1    A   5 5.000000
2    B   6 6.000000
3    C   7 7.000000
4    A   5 5.000000
5    B   6 6.000000
6    C   6 6.500000
7    A   7 5.666667
8    B   6 6.000000
9    C   3 5.333333
Community
  • 1
  • 1
C8H10N4O2
  • 18,312
  • 8
  • 98
  • 134
0

Using dplyr:

g <- data.frame(Name=rep(c("A","B","C"),3),
            GPA=c(5,6,7,5,6,6,7,6,3))

g %>%
group_by(Name) %>%
mutate(cumu = lag(cummean(GPA), n = 0))

Output:

Source: local data frame [9 x 3]
Groups: Name [3]

Name   GPA     cumu
<fctr> <dbl>    <dbl>
1      A     5 5.000000
2      B     6 6.000000
3      C     7 7.000000
4      A     5 5.000000 
5      B     6 6.000000
6      C     6 6.500000
7      A     7 5.666667
8      B     6 6.000000
9      C     3 5.333333
Jash Shah
  • 2,064
  • 4
  • 23
  • 41