-1

This is redshift SQL I'm trying to get 2 results for a week:

  1. Total records in that week
  2. Total records ageing greater than 7 days from that week.

say there are sample 100 records in below format, in current example 7 records/week:

day         code    week
1/1/2020    P001    1
1/2/2020    P002    1
1/3/2020    P003    1
1/4/2020    P004    1
1/5/2020    P005    2
1/6/2020    P006    2
1/7/2020    P007    2
1/8/2020    P008    2
1/9/2020    P009    2
1/10/2020   P010    2
1/11/2020   P011    2
.....................
4/8/2020    P099    15

Trying to get output like this:

Week count count>7 days
1     7     0
2     7     7
3     7     14
4     7     21
15    7     98

Basically for the latest week, i'm trying to get distinct number of records ageing more than 7 days. In actual use case, the number of records in week will vary.

What i've tried:

calendar_week_number,
count(code) as count 1,
count(DISTINCT (case when datediff(day, trunc(completion_date-7), '2020-01-01') then code end)) as count 2,
count(case when completion_date between TO_DATE('20200101','YYYYMMDD') and TO_DATE(completion_date,'YYYYMMDD')-7 then code end) as count 3

from rbsrpt.RBS_DAILY_ASIN_PROC_SNPSHT ul
    LEFT JOIN rbsrpt.dim_rbs_time t  ON Trunc(ul.completion_date) = trunc(t.cal_date)

where
mp=1
and calendar_year=2020

group by
calendar_week_number

order by calendar_week_number desc

but my output is as below:

week    count1  count 2 count 3
51      2866    2866    0
50      3211    3211    0
49      6377    6377    0
48      9013    9013    0
47      5950    5950    0
newbie93
  • 3
  • 2
  • Welcome to SO. Please see [Why should I provide an MCRE for what seems to me to be a very simple SQL query](http://meta.stackoverflow.com/questions/333952/why-should-i-provide-an-mcve-for-what-seems-to-me-to-be-a-very-simple-sql-query) – Strawberry Dec 20 '20 at 18:03
  • You have two definitions of ageing in your question. You start with age > 30 days, then your examples are for age > 7 days. Which is correct? – Adrian Klaver Dec 20 '20 at 18:09
  • @AdrianKlaver My actual use case is above 30 days, Steamlined question to ageing more than 7 days to avoid churn. – newbie93 Dec 20 '20 at 18:15
  • @a_horse_with_no_name postgres – newbie93 Dec 20 '20 at 18:16
  • @a_horse_with_no_name my apologies, correction this is redshift not postgres. – newbie93 Dec 20 '20 at 18:44

1 Answers1

0

One option uses lateral joins. It is probably more efficient to aggregate the calendar table by weeks first, then perform the searches on week per week in the dataset.

Assuming Postgres (since there is no TO_DATE() in MySQL):

select d.cal_date, c1.*, c2.*
from (
    select calendar_week_number, min(cal_date) as cal_date
    rbsrpt.dim_rbs_time t
    group by calendar_week_number
) t
cross join lateral (
    select count(*) as cnt
    from rbsrpt.rbs_daily_asin_proc_snpsht r
    where r.completion_date >= t.cal_date
      and r.completion_date <  t.cal_date + interval '7 day'
) c1
cross join lateral (
    select count(*) as cnt_aged
    from rbsrpt.rbs_daily_asin_proc_snpsht r
    where r.completion_date >= t.cal_date - interval '7' day
      and r.completion_date <  t.cal_date
) c2

This ages out records after 7 days. If you wanted 30 days instead, you would change the where clause of the second subquery:

cross join lateral (
    select count(*) as cnt_aged
    from rbsrpt.rbs_daily_asin_proc_snpsht r
    where r.completion_date >= t.cal_date - interval '30 day'
      and r.completion_date <  t.cal_date - interval '23 day'
) c2

Edit: if your database does not support lateral joins, you can use subqueries instead:

select d.cal_date, 
    (
        select count(*) 
        from rbsrpt.rbs_daily_asin_proc_snpsht r
        where r.completion_date >= t.cal_date
          and r.completion_date <  t.cal_date + interval '7 day'
    ) as cnt,
    (
        select count(*) 
        from rbsrpt.rbs_daily_asin_proc_snpsht r
        where r.completion_date >= t.cal_date - interval '7' day
          and r.completion_date <  t.cal_date
    ) as cnt_aged
from (
    select calendar_week_number, min(cal_date) as cal_date
    rbsrpt.dim_rbs_time t
    group by calendar_week_number
) t
GMB
  • 216,147
  • 25
  • 84
  • 135
  • Thanks for the help, I tried, I should have mentioned earlier this sql is based on amazon redshift. lateral join statements are not executed. – newbie93 Dec 20 '20 at 18:46