1

I'm sure this question has been asked before, but I can't find the answer.

Here's my data:

df <- data.frame(group=c("a","a","a","b","b","c"), value=c(1,2,3,4,5,7))
df
#>   group value
#> 1     a     1
#> 2     a     2
#> 3     a     3
#> 4     b     4
#> 5     b     5
#> 6     c     7

I'd like a third column that holds the sum of "value" within each "group", like so:

#>   group value group_sum
#> 1     a     1         6
#> 2     a     2         6
#> 3     a     3         6
#> 4     b     4         9
#> 5     b     5         9
#> 6     c     7         7

How can I do this with dplyr?

J. Mini
Rez99

4 Answers

3

Using dplyr:

library(dplyr)

# mutate() on a grouped data frame computes sum(value) within each group
df %>%
    group_by(group) %>%
    mutate(group_sum = sum(value))
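
Two variations on the same idea, as a minimal sketch. It assumes dplyr 1.1.0 or later for the `.by` argument, and the names `res_grouped` and `res_by` are only for illustration. `ungroup()` drops the grouping once the sum is computed, while `.by` groups for that single `mutate()` call only.

```r
library(dplyr)

df <- data.frame(group = c("a", "a", "a", "b", "b", "c"),
                 value = c(1, 2, 3, 4, 5, 7))

# group, add the per-group sum, then drop the grouping again
res_grouped <- df %>%
  group_by(group) %>%
  mutate(group_sum = sum(value)) %>%
  ungroup()

# per-operation grouping (assumes dplyr >= 1.1.0); the result is not grouped
res_by <- df %>%
  mutate(group_sum = sum(value), .by = group)
```

Both should give the same `group_sum` column. The difference only matters if later steps in the pipeline should not stay grouped.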
Hearkz
3

Nobody mentioned data.table yet:

library(data.table)

dat <- data.table(df)

# := adds the column by reference; by = group computes the sum per group
dat[, sums := sum(value), by = group]

This modifies dat in place, so it now prints as:

   group value sums
1:     a     1    6
2:     a     2    6
3:     a     3    6
4:     b     4    9
5:     b     5    9
6:     c     7    7
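
Since `:=` updates by reference, it may help to see which object actually changes. A small sketch, assuming `as.data.table()` is used for the conversion (it copies the data frame, so the original `df` is left alone):

```r
library(data.table)

df <- data.frame(group = c("a", "a", "a", "b", "b", "c"),
                 value = c(1, 2, 3, 4, 5, 7))

dat <- as.data.table(df)               # a copy; df itself is not touched
dat[, sums := sum(value), by = group]  # adds the column to dat in place

names(df)   # still only group and value
names(dat)  # group, value and the new sums column
```

No reassignment (`dat <- ...`) is needed after the `:=` call.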
utubun
2
library(dplyr)

# compute one sum per group, then join it back onto every row of df
left_join(
  df,
  df %>% group_by(group) %>% summarise(group_sum = sum(value)),
  by = "group"
)
lkq
1

I don't know how to do it in one step, but

group_sums <- df %>% group_by(group) %>% summarize(group_sum = sum(value))
df %>% full_join(group_sums, by = "group")

works. (This is basically equivalent to @KeqiangLi's answer.)
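
For what it's worth, a one-step version of the same result can be sketched with dplyr's `add_count()`, which sums the `wt` column within each group and keeps every row. The name `res` is only for illustration.

```r
library(dplyr)

df <- data.frame(group = c("a", "a", "a", "b", "b", "c"),
                 value = c(1, 2, 3, 4, 5, 7))

# add_count() with wt = value adds sum(value) per group as a new column
res <- df %>%
  add_count(group, wt = value, name = "group_sum")
```

This avoids building the intermediate summary table, at the cost of being less obvious to read than an explicit `sum()`.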

ave(), from base R, is useful here too:

df %>% mutate(group_sum = ave(value, group, FUN = sum))
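
If dplyr isn't needed at all, the same `ave()` call should work with plain base R assignment. A minimal sketch:

```r
df <- data.frame(group = c("a", "a", "a", "b", "b", "c"),
                 value = c(1, 2, 3, 4, 5, 7))

# ave() returns a vector as long as its input, with FUN applied per group
df$group_sum <- ave(df$value, df$group, FUN = sum)
```

Because `ave()` keeps the original row order and length, the result can be assigned straight into a new column.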
Ben Bolker