As a specific example say I have a table T
with columns customer
and date
indicating days on which individual customers have made purchases:
customer | date
----------------------
A | 01/01/2013
A | 02/01/2013
A | 07/01/2013
A | 11/01/2013
B | 03/01/2013
B | 08/01/2013
I want to add another column that for each pair (customer, date)
pair (c, d)
, gives the number of pairs (c', d')
in T
such that c = c'
and 0 <= days(d) - days(d') <= 7
. Below is the table with this extra column:
customer | date | new_column
----------------------------------
A | 01/01/2013 | 1
A | 02/01/2013 | 2
A | 07/01/2013 | 3
A | 11/01/2013 | 2
B | 03/01/2013 | 1
B | 10/01/2013 | 1
As a rough idea of the steps I used to solve this problem:
- create a table
T'
with all possible pairs(c,d)
; - left join
T
ontoT'
; - create a new column:
count(date) over (partition by customer order by date asc rows between 6 preceding and 0 following)
; - omit any rows from this new table where
T.date is null
However, I don't think this is scalable.
Cheers for any help.