This is my data frame:

x day month
5 1 1
4 1 1 
1 2 1
3 2 1
5 1 2
2 1 2
5 2 2
3 2 2

I need the sum of the x values for each day in each month. I have already tried:

tapply(DF$x, DF$day, max) 

but it is not giving the right answers.

Madamespring

2 Answers

Try the data.table package:

library(data.table)
DT <- data.table(DF)  # DF is the data frame from the question (R is case-sensitive)
DT[, list(Sum=sum(x)), by = c("day","month")]

    day month Sum
1:   1     1  9
2:   2     1  4
3:   1     2  7
4:   2     2  8

OR use the sqldf package:

library(sqldf)
sqldf("select day, month, sum(x) as sum from DT group by day, month")

OR using the base aggregate function:

aggregate(DT$x, FUN=sum, by = list(DT$day, DT$month))

A cleaner way, suggested by Frank:

aggregate(x~day+month, DT, sum)
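For anyone who wants to run this end to end, here is a minimal base R sketch. It rebuilds the data from the question as a plain data frame (the numeric column types are an assumption, and the name res is only for illustration) and then applies the formula interface:

```r
# Rebuild the example data from the question (column types are an assumption)
DF <- data.frame(
  x     = c(5, 4, 1, 3, 5, 2, 5, 3),
  day   = c(1, 1, 2, 2, 1, 1, 2, 2),
  month = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# Sum x within every day/month combination; the result is a long data frame
# with one row per combination and the sums stored in column x
res <- aggregate(x ~ day + month, data = DF, FUN = sum)
print(res)
```

The sums in res$x should match the Sum column of the data.table output above.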

OR using the dplyr package (as suggested by Frank):

library(dplyr)
DT %>% 
    group_by(day, month) %>% 
    summarise(Sum = sum(x))
akrun
CuriousBeing

The question title is about tapply, and the call in the OP's post does not give the right answer because it groups by day only and uses max. If we need a cross-tabular version, one option with tapply is to place the grouping variables in a list and specify the FUN as sum:

with(DF, tapply(x, list(day, month), FUN=sum))
#  1 2
#1 9 7
#2 4 8
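As a self-contained sketch of the same idea, the snippet below rebuilds the data (numeric column types are an assumption), runs the call from the question for contrast, and then optionally reshapes the cross-table into long format. The names by_day_max, tab and long are hypothetical and only used for illustration:

```r
# Rebuild the example data from the question (column types are an assumption)
DF <- data.frame(
  x     = c(5, 4, 1, 3, 5, 2, 5, 3),
  day   = c(1, 1, 2, 2, 1, 1, 2, 2),
  month = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# The call from the question: a single grouping variable and max, so it
# returns the largest x per day rather than a sum per day and month
by_day_max <- tapply(DF$x, DF$day, max)

# Grouping by both variables and summing gives a day-by-month matrix;
# naming the list elements labels the dimensions of the result
tab <- with(DF, tapply(x, list(day = day, month = month), FUN = sum))

# Optional: reshape the matrix into long format, one row per day/month pair
long <- as.data.frame(as.table(tab), responseName = "Sum")
print(long)
```

Naming the list elements is a small design choice: it carries the labels day and month through to the long-format column names instead of the default Var1 and Var2.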

Or this can be done with xtabs, which sums the left-hand side of the formula by default:

xtabs(x~day+month, DF)
#    month
#day 1 2
#   1 9 7
#   2 4 8

Or with by:

by(DF[1], DF[-1], FUN= sum)
akrun