I'd like to create a weighted average from one data frame and add it to another. Typically, I do this in SQL, but I cannot in this case. The example I give here is very simplified.
The first data frame is called Customer. It has a CustomerID column.
Customer <- data.frame(
CustomerID = sample(1:10)
)
The second data frame is called Order. It has four fields: CustomerID, Year, Weight, and TotalCost.
Order <- data.frame(
CustomerID = sample(1:9, 100, replace=TRUE),
Year = sample(2014:2016, 100, replace=TRUE),
Weight = sample(1:3, 100, replace=TRUE),
TotalCost = sample(200:400, 100, replace=TRUE)
)
I'd like to add a column WeightedCost to the Customer data frame that is the weighted average TotalCost for that customer, calculated Sum(Weight*TotalCost)/Sum(Weight) limited by the CustomerID field and where the Year > 2015.
I've looked at looping thru the Customer table, but I'm pretty sure there is a better vectorized solution.
Additionally (late edit), I'd like to know how to implement this if the year comes from the Customer table rather than explicitly defined. Here is the new Customer data frame:
c = c(1,1,2,2,3,3,4,4,5,5)
y = c(2014,2015,2014,2015,2014,2015,2014,2015,2014,2015)
Customer <- data.frame(
CustomerID = c,
Year = y
)