I asked a very general version of this question a while ago. I thought I had enough programming background to get from that answer to my own function, but it turns out I was wrong. This is my first time using R, and I'm having some trouble.

Given the following dataset:

Amount_Bought            CustomerID
12                       28
18                       28
2                        6
9                        6
10                       6

I want to create a column called "average spending" that holds each customer's average spending, grouped by their ID. There are about 1000 entries in the data, with a varying number of purchases per customer.

For example, for customerID 28, I would want average spending to be (12 + 18)/2 = 15

So, something like this:

Amount_Bought            CustomerID         Average_Spending
12                       28
18                       28                 15
2                        6
9                        6
10                       6                  7

How would I go about doing this? Thank you

user3044487

1 Answer

How about:

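library(plyr)
# Per-customer mean; the grouping column name must match the data (CustomerID)
sumdat <- ddply(my_data, "CustomerID", summarise,
                avg_spending = mean(Amount_Bought))
# Join the averages back onto the original rows via the shared CustomerID column
merge(my_data, sumdat)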

(There are many ways to aggregate data like this in R: ave and aggregate in base R, the dplyr package, the data.table package, and so on. There are lots of questions on SO comparing the efficiency etc. of these approaches, e.g. "Joining aggregated values back to the original data frame".)
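For reference, here is a minimal base-R sketch of the ave and aggregate routes mentioned above, with no extra packages. It assumes the data frame is called my_data and uses the five rows from the question; the new column name Average_Spending and the per_customer variable are just illustrative choices. Note that ave fills the average on every row of a customer, not only the last one.

```r
# Sample data copied from the question (my_data is an assumed name)
my_data <- data.frame(
  Amount_Bought = c(12, 18, 2, 9, 10),
  CustomerID    = c(28, 28, 6, 6, 6)
)

# ave() applies FUN within each CustomerID group and returns a vector
# aligned with the original rows, so no merge step is needed
my_data$Average_Spending <- ave(my_data$Amount_Bought,
                                my_data$CustomerID,
                                FUN = mean)

# aggregate() instead returns one row per customer
per_customer <- aggregate(Amount_Bought ~ CustomerID, data = my_data, FUN = mean)

print(my_data)
print(per_customer)
```

If you only need the one-row-per-customer summary, per_customer is enough; the ave version is the one that matches "add a column to the original data".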

Ben Bolker