2

I am trying to sum sales on my table grouped for next 30 days.

Here is the data I have.

date sales
2021-08-08 35
2021-08-08 14
2021-08-11 35
2021-09-09 22
2021-09-21 44
2021-10-16 46
2021-10-25 9
2021-10-25 1
2021-10-25 2
2021-10-25 6
2021-11-04 1
2021-11-07 1

I am expecting the result like this

date sales total 30d
2021-08-08 14 84
2021-08-08 35 84
2021-08-11 35 57
2021-09-09 22 66
2021-09-21 44 90
2021-10-16 46 76
2021-10-25 9 30
2021-10-25 6 30
2021-10-25 2 30
2021-10-25 1 30
2021-11-04 1 12
2021-11-07 1 29

Explanation: For date 2021-08-08 it's going to sum the sales from date >= 2021-08-08 til 30 days after which is 2021-09-07 so it would be 14+35+35 (2021-08-08, 2021-08-08 and 2021-08-11).

I got my work around using this query, but I think it is very inefficient and I believe there's other way using windows function though I can't wrap my head around it.

with temp as (
    select date, sum(sales) as sales_total
from test_table
group by 1,2
)
, temp2 as (
SELECT a.date, SUM(b.sales_total) total
FROM temp a, temp b
WHERE b.date >= a.date AND b.date <= a.date + interval '30' day
GROUP BY a.date
)
select a.date, a.sales, b.total 
from test_table a 
JOIN temp2 b on a.date = b.date

Any pointer to solve this problem using windows function or any other more efficient way is appreciated. Thanks!

3 Answers3

2

Here is a way to do this using recursive cte.

with data
   as (select start_dt,sales,row_number() over(order by start_dt) as rnk       
         from t1
       )
     ,cte_data
    as ( select d.start_dt,d.sales,d.rnk,d.start_dt as grp_dt
           from data d
          where rnk=1
        union all
        select d.start_dt,d.sales,d.rnk,case when datediff(day,d2.grp_dt,d.start_dt)>30 then 
                                           d.start_dt
                                     else d2.grp_dt
                                 end
          from cte_data d2
          join data d
            on d.rnk=d2.rnk+1
       )
  select min(start_dt) start_range,max(start_dt) end_range,sum(sales) as sum_sales
    from cte_data
 group by grp_dt

+-------------+------------+-----------+
| start_range | end_range  | sum_sales |
+-------------+------------+-----------+
| 2021-08-08  | 2021-08-11 |        84 |
| 2021-09-09  | 2021-09-21 |        66 |
| 2021-10-16  | 2021-11-07 |        66 |
+-------------+------------+-----------+

sql fiddle link https://dbfiddle.uk/2LmQe5n5

To elaborate a bit on this. We would need some kind of recursive process to do this, as the results of the following row depends on the previous entry date, and so I doubt we could do this using window functions.

George Joseph
  • 5,842
  • 10
  • 24
1

It seems that you have an overlapping groups problem, you could use a self-join and aggregate as the following:

SELECT T.date, T.sales,
       SUM(D.sales) AS total_30d 
FROM
test_table T JOIN test_table D
ON D.date BETWEEN T.date AND T.date + interval '30' day
GROUP BY T.date, T.sales
ORDER BY T.date, T.sales

See a demo.

ahmed
  • 9,071
  • 3
  • 9
  • 22
1

Straightforward using a scalar subquery:

select ext.*, 
(
  select sum(sales) from the_table
  where "date" between ext.date and ext.date + 30
) as total_30d 
from the_table as ext;

DB-fiddle demo.

date sales sum
2021-08-08 35 84
2021-08-08 14 84
2021-08-11 35 57
2021-09-09 22 66
2021-09-21 44 90
2021-10-16 46 66
2021-10-25 9 20
2021-10-25 1 20
2021-10-25 2 20
2021-10-25 6 20
2021-11-04 1 2
2021-11-07 1 1
Stefanov.sm
  • 11,215
  • 2
  • 21
  • 21
  • This solution groups records on the basis of the its begin date, and does not consider if the entry has already been accounted for. – George Joseph Dec 09 '22 at 22:56
  • @AndriWijaya Consider this scenario when the second record is 2021-08-09. This entry belongs to the first 30 days(2021-08-08,2021-08-09,2021-08-11). And it also belongs to the group (2021-08-09,2021-08-11,2021-09-09). If this is acceptable scenario, then its a cool solution. See Demo https://dbfiddle.uk/HeRSzuWK – George Joseph Dec 09 '22 at 23:03