Why does the summarize function give only one result?

Question

smalldat <- data.frame(group1 = rep(1:2, c(5,5)),
                       group2 = rep(c("a","b"), 5),
                       x = rnorm(10))

smalldat
#    group1 group2          x
# 1       1      a -1.2173399
# 2       1      b  0.2601609
# 3       1      a -1.9955389
# 4       1      b -0.7949134
# 5       1      a  0.9655160
# 6       2      b -1.2307946
# 7       2      a  0.3562118
# 8       2      b  0.7674343
# 9       2      a -0.2472418
# 10      2      b -1.2653220
a <- group_by(smalldat, group1)
summarize(a, mm = mean(x))
#           mm
# 1 -0.1690133

So why do I get the mean of all x instead of a separate mean for group 1 and for group 2? Thank you.

Your code works on my machine too. I have dplyr_0.4.3. – jazzurro, Mar 04 '16 at 02:25
Despite my answer, your code also works for me (0.4.3); seems you have a typo so I'm voting to close unless you can clarify. – MichaelChirico, Mar 04 '16 at 02:27
Probably you loaded `plyr` after `dplyr`, ignored the Warning that prints when you do that, and you're using `plyr::summarize` instead of `dplyr::summarize`. – Gregor Thomas, Mar 04 '16 at 02:29
Ohhhhhhh, I've always wondered why this happened! Thanks, Gregor! – InfiniteFlash, Mar 04 '16 at 03:25
A possible canonical answer for @Gregor's comment: [Why does summarise on grouped data result in only overall summary in dplyr?](http://stackoverflow.com/questions/26106146/why-does-summarise-on-grouped-data-result-in-only-overall-summary-in-dplyr) – Henrik, Mar 04 '16 at 08:53
@InfiniteFlashChess Makes me think of `fortunes::fortune(9)`. – Gregor Thomas, Mar 04 '16 at 08:59
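
The masking described in the comments above can be sketched as follows. This is only an illustration, assuming both `dplyr` and `plyr` are installed; the `set.seed()` call and the names `per_group` and `overall` are additions for the example and are not part of the original question.

```r
library(dplyr)

set.seed(1)  # assumption: seed added so the sketch is reproducible
smalldat <- data.frame(group1 = rep(1:2, c(5, 5)),
                       group2 = rep(c("a", "b"), 5),
                       x = rnorm(10))

a <- group_by(smalldat, group1)

# Reports which package a bare summarize() call would currently resolve to.
environmentName(environment(summarize))

# Namespace-qualified calls do not depend on which package was attached last.
per_group <- dplyr::summarize(a, mm = mean(x))  # respects the grouping
overall   <- plyr::summarize(a, mm = mean(x))   # ignores the grouping

nrow(per_group)  # 2: one row per value of group1
nrow(overall)    # 1: a single overall mean
```

If the check reports `plyr`, either qualify the call as `dplyr::summarize()` or restart the session and attach `plyr` before `dplyr`.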

score 1 · Answer 1 · answered Mar 04 '16 at 02:22

You need to break out the pipes.

library(dplyr)
smalldat %>% group_by(group1) %>% summarize(mm = mean(x))

# Source: local data frame [2 x 2]
# 
#   group1         mm
#    (int)      (dbl)
# 1      1 -0.5564231
# 2      2 -0.3239425

(requisite data.table plug: I find this more readable):

library(data.table); setDT(smalldat)

smalldat[ , mean(x), by = group1]

# or, named:
smalldat[ , .(mm = mean(x)), by = group1]

MichaelChirico

Pipes are just syntactic sugar; they're not necessary. – Gregor Thomas Mar 04 '16 at 02:28
@Gregor realized in retrospect. OP's code looked fine, but I assumed something was wrong since I'm used to seeing pipes and he was getting wrong output. – MichaelChirico Mar 04 '16 at 02:29
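
To illustrate the point made in these comments, here is a small sketch comparing the piped and nested forms. It assumes `dplyr` is installed; the seed and the names `piped` and `nested` are hypothetical additions for the example.

```r
library(dplyr)

set.seed(1)  # assumption: seed added so the sketch is reproducible
smalldat <- data.frame(group1 = rep(1:2, c(5, 5)),
                       group2 = rep(c("a", "b"), 5),
                       x = rnorm(10))

# Piped form, as in the answer above.
piped <- smalldat %>% group_by(group1) %>% dplyr::summarize(mm = mean(x))

# Nested form, equivalent to the two-step code in the question.
nested <- dplyr::summarize(group_by(smalldat, group1), mm = mean(x))

# Both forms produce the same per-group means.
all.equal(piped$mm, nested$mm)
```

Both calls are qualified with `dplyr::` so the comparison does not depend on whether `plyr` is attached.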

score 0 · Answer 2 · answered Mar 04 '16 at 03:49

As an alternative, we can use `aggregate` from base R:

aggregate(x~group1, smalldat, mean)
# group1         x
#1      1 0.2487354
#2      2 0.2275124

akrun