aggregate sum and duplicates in r

Question

I've seen similar questions being asked but I am unable to apply it to my own data. I am trying to aggregate Value, Value 2 and Value 3 by Product ID and Revenue where the aggregate for Value is sum. However; I only want value2 and value3 to show one value for the duplicates

Here is my code:

aggregate(Value, Value2, Value3 ~product_id + Revenue, dat,sum)

The data:

dat <-structure(list(product_id = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
               Date = c("January", "February", "March", "January", "February", "March", "January", "February", "March", "January", "February", "March"),
               Revenue = c("in", "in", "in", "out", "out", "out", "in", "in", "in", "out", "out", "out"),
               Value = c(0L, 1L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 0L, 0L, 0L),
           Value2 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L),
           Value3 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L) 
             ),
          .Names = c("product_id",  "Date", "Revenue", "Value", "Value2", "Value3"),
          class = "data.frame", row.names = c(NA, -12L))

So that it would look like:

product i_d Revenue Value Value2 Value 3 
1           in      1     1      1
2           in      6     2      2 
1           out     0     3      3 
2           out     0     4      4

There is a typo in your dat definition. You do not hhave Value2, but Value3 is defined twice. — lmo, Jun 27 '16 at 21:57
Once you fix that, the following will work: `aggregate(cbind(Value, Value2, Value3) ~product_id + Revenue, data=dat,sum)`. You can use `cbind` to include multiple variables to be computed. — lmo, Jun 27 '16 at 21:57

score 0 · Answer 1 · answered Jun 27 '16 at 22:16

dat <-structure(list(product_id = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
                     Date = c("January", "February", "March", "January", "February", "March", "January", "February", "March", "January", "February", "March"),
                     Revenue = c("in", "in", "in", "out", "out", "out", "in", "in", "in", "out", "out", "out"),
                     Value = c(0L, 1L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 0L, 0L, 0L),
                     Value2 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L),
                     Value3 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L) 
),
.Names = c("product_id",  "Date", "Revenue", "Value", "Value2", "Value3"),
class = "data.frame", row.names = c(NA, -12L))

res <- aggregate(dat[,colnames(dat) %in% c("Value", "Value2", "Value3")],by=list(dat$product_id, dat$Revenue),FUN=sum)

colnames(res) <- c("product_id", "Revenue", "Value", "Value 2", "Value 3")
res
  product_id Revenue Value Value 2 Value 3
1          1      in     1       3       3
2          2      in     6       9       9
3          1     out     0       6       6
4          2     out     0      12      12

aggregate sum and duplicates in r

1 Answers1