smalldat <- data.frame(group1 = rep(1:2, c(5,5)),
                       group2 = rep(c("a","b"), 5),
                       x = rnorm(10))

smalldat
#    group1 group2          x
# 1       1      a -1.2173399
# 2       1      b  0.2601609
# 3       1      a -1.9955389
# 4       1      b -0.7949134
# 5       1      a  0.9655160
# 6       2      b -1.2307946
# 7       2      a  0.3562118
# 8       2      b  0.7674343
# 9       2      a -0.2472418
# 10      2      b -1.2653220
a <- group_by(smalldat, group1)
summarize(a, mm = mean(x))
#           mm
# 1 -0.1690133

So why do I get the mean of all of x, instead of one mean for each value of group1 (1 and 2)? Thank you.

Pierre L
botwithtom
  • I tried your code and it's working for me. – ytk Mar 04 '16 at 02:22
  • Your code is working on my machine too. I have dplyr_0.4.3. – jazzurro Mar 04 '16 at 02:25
  • Despite my answer, your code also works for me (0.4.3); seems you have a typo so I'm voting to close unless you can clarify. – MichaelChirico Mar 04 '16 at 02:27
  • Probably you loaded `plyr` after `dplyr`, ignored the Warning that prints when you do that, and you're using `plyr::summarize` instead of `dplyr::summarize`. – Gregor Thomas Mar 04 '16 at 02:29
  • Ohhhhhhh, I've always wondered why this happened! Thanks, Gregor! – InfiniteFlash Mar 04 '16 at 03:25
  • A possible canonical answer for @Gregor's comment: [Why does summarise on grouped data result in only overall summary in dplyr?](http://stackoverflow.com/questions/26106146/why-does-summarise-on-grouped-data-result-in-only-overall-summary-in-dplyr) – Henrik Mar 04 '16 at 08:53
  • @InfiniteFlashChess Makes me think of `fortunes::fortune(9)`. – Gregor Thomas Mar 04 '16 at 08:59
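
If the masking Gregor describes is the cause, qualifying the calls with their namespace should sidestep it regardless of load order. Below is a minimal sketch, assuming `dplyr` is installed; the `set.seed()` call and the name `res` are my additions for reproducibility and are not part of the original question.

```r
library(dplyr)

set.seed(1)  # assumption: seed added only so the example is reproducible
smalldat <- data.frame(group1 = rep(1:2, c(5, 5)),
                       group2 = rep(c("a", "b"), 5),
                       x = rnorm(10))

# Diagnostic: which package does the unqualified `summarize` come from?
# If this reports "plyr", plyr's version is masking dplyr's.
environmentName(environment(summarize))

# Namespace-qualified calls always use dplyr's versions,
# even if plyr was attached afterwards.
a <- dplyr::group_by(smalldat, group1)
res <- dplyr::summarize(a, mm = mean(x))
res  # one row per value of group1
```

If both packages really are needed in the same session, loading `plyr` before `dplyr` is the usual way to avoid the masking.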

2 Answers


One option is to chain the calls with pipes (`%>%`, available once `dplyr` is loaded):

smalldat %>% group_by(group1) %>% summarize(mm = mean(x))

# Source: local data frame [2 x 2]
# 
#   group1         mm
#    (int)      (dbl)
# 1      1 -0.5564231
# 2      2 -0.3239425

(requisite data.table plug: I find this more readable):

library(data.table); setDT(smalldat)

smalldat[ , mean(x), by = group1]

#or, named:
smalldat[ , .(mm = mean(x)), by = group1]
MichaelChirico

As an alternative, we can use `aggregate` from base R:

aggregate(x~group1, smalldat, mean)
# group1         x
#1      1 0.2487354
#2      2 0.2275124
akrun