67

I have some member order data that I would like to aggregate by week of order.

This is what the data looks like:

memberorders=data.frame(MemID=c('A','A','B','B','B','C','C','D'),
             week = c(1,2,1,4,5,1,4,1),
             value = c(10,20,10,10,2,5,30,3))

I'm using dplyr to group_by MemID and summarize "value" for week<=2 and week<=4 (to see how much each member ordered in weeks 1-2 and 1-4. The code I currently have is:

MemberLTV <- memberorders %>%
group_by(MemID) %>%
summarize(
sum2 = sum(value[week<=2]),
sum4 = sum(value[week<=4]))

I'm now trying to add two more fields in summarize, count2 and count4, that would count the number of instances of each condition (week <=2 and week <=4).

The desired output is:

output  = data.frame(MemID = c('A','B','C','D'),
                 sum2 = c(30,10,5,3),
                 sum4 = c(30,20,35,3),
                 count2 = c(2,1,1,1),
                 count4 = c(2,2,2,1))

I'm guessing it's just a little tweak of the sum function but I'm having trouble figuring it out.

Nad Pat
  • 3,129
  • 3
  • 10
  • 20
SFuj
  • 925
  • 1
  • 9
  • 14

4 Answers4

80

Try

 library(dplyr)
 memberorders %>% 
        group_by(MemID) %>% 
        summarise(sum2= sum(value[week<=2]), sum4= sum(value[week <=4]), 
                  count2=sum(week<=2), count4= sum(week<=4))
akrun
  • 874,273
  • 37
  • 540
  • 662
23

Using both previous ideas and keeping it consistent:

library(tidyverse)

MemberLTV_2 <- memberorders %>%
  group_by(MemID) %>%
    summarize(
      count2 = length(value[week<=2]),
      count4 = length(value[week<=4]),
      sum2 = sum(value[week<=2]),
      sum4 = sum(value[week<=4])
      )
iNyar
  • 1,916
  • 1
  • 17
  • 31
Camilo Abboud
  • 879
  • 7
  • 7
4

Using the plyr package one could do

ddply(memberorders,.(MemID),
                    summarise, 
                    val1 = sum(value[week<=2]), 
                    val2 = sum(value[week<=4]),
                    val3 = length(value[week<=2]),
                    val4 = length(value[week<=4]))

  MemID val1 val2 val3 val4
1     A   30   30    2    2
2     B   10   20    1    2
3     C    5   35    1    2
4     D    3    3    1    1
David Arenburg
  • 91,361
  • 17
  • 137
  • 196
Alexey Ferapontov
  • 5,029
  • 4
  • 22
  • 39
-1

Another solution could be with ifelse where you can sum your value or count summing 1.

library(dplyr)
    
memberorders %>% 
  group_by(MemID) %>% 
  summarise(sum2 = sum(ifelse(week <= 2, value, 0)),
            sum4 = sum(ifelse(week <= 4, value, 0)), 
            count2 = sum(ifelse(week <= 2, 1, 0)),
            count4 = sum(ifelse(week <= 4, 1, 0)))
user438383
  • 5,716
  • 8
  • 28
  • 43