Facing difficulties with subset calculations. I am able to get overall stats like average purchase by customer (factor) using ave
, tapply
, ddply
but I am not able to calculate visit by visit stats for each customer. Some simplified data below to illustrate my data and ideal results.
Current Dataframe: (Note that visit #1 is the most recent visit)
customer visit date purchase_amt
sarah 2 2013-08-09 5
sarah 3 2013-07-21 8
sarah 4 2013-06-23 9
sarah 5 2013-06-02 1
sarah 1 2013-08-20 8
henry 1 2013-07-04 4
che 1 2013-08-27 2
che 2 2013-07-27 1
che 3 2013-07-05 8
che 4 2013-06-14 3
dt 3 2013-04-05 9
dt 2 2013-06-07 1
dt 1 2013-07-11 6
These are the results I am seeking:
customer visit date purchase_amt days since amt_diff
sarah 2 2013-08-09 5 19 -3
sarah 3 2013-07-21 8 28 -1
sarah 4 2013-06-23 9 21 8
sarah 5 2013-06-02 1 NA NA
sarah 1 2013-08-20 8 11 3
henry 1 2013-07-04 4 NA NA
che 1 2013-08-27 2 31 1
che 2 2013-07-27 1 22 -7
che 3 2013-07-05 8 21 5
che 4 2013-06-14 3 NA NA
dt 3 2013-04-05 9 NA NA
dt 2 2013-06-07 1 63 -8
dt 1 2013-07-11 6 34 5
So in summary, I would like to find most recent visit of a customer and the attributes of it, then find the next visit attributes and calculate various stats on the two. Return "NA" when there are no more previous visits.