1

Currently in the old system we have, the code generates a sequence of dates, and then that sequence is joined against the inner query it's created from. What this ends up with (the eventual result), is a table that returns information about a specific date as if it were a date in the past, e.g. we had a value of X for 2/2/2022 on 1/2/2022, and then we had a value of X+3 for 2/2/2022 on 1/14/2022. So overtime we can see that whatever X is is growing.

We are currently migrating from a legacy system to Snowflake. I am trying to re-write this in Snowflake. The issue I am having is that the sequence function called in the old system is not one that works in Snowflake, and I need to figure out an equivalent method. I have tried reading about Snowflake sequences, but I don't quite understand how to write it in a way that achieves this same result.

Any help appreciated!

Here is a snippet of the old code followed by code I've tried:

OLD CODE:

--this thing is inside an inner query called inner_query
sequence(
date_add('day', -182, a_beginning_date_formated_as_a_date),
date_add('day', 40, an_end_date_formated_as_a_date)
) as ranges

--this cross join joins that inner query to the ranges inside the inner query
cross join unnest(ranges) as t (range)

-
-
-

NEW CODE: 

--I tried to get lucky by copying the old code verbatem 
create sequence ranges(
date_add('day', -182, a_beginning_date_formated_as_a_date),
date_add('day', 40, an_end_date_formated_as_a_date))

--I tried this code that I found on StackOverflow
(select -1 + row_number() over(order by 0) i, start_date + i generated_date
from (select dateadd(day,-182,start_date)::date start_date, dateadd(day,40,end_date)::date end_date)
join table(generator(rowcount => 10000 )) x qualify i < 1 + end_date - start_date) as ranges,
  • I think it would be much easier to help you if your question was geared towards reproducing a representative result set in Snowflake as opposed to migrating a piece of code to Snowflake. So if you could include a sample output of the legacy script before and after applying the sequence, that would be helpful. – Radagast Apr 26 '22 at 16:23
  • Hey Phil! I'm not sure how best to format comments here, but the gist of the output would be a column of t-minus (t-150, t-149, t-148... t+5, t+6...etc), and columns of useful KPIs, say total tickets sold, so you'd be able to look back and see the total tickets for a June 15th show, on May 6th, May 7th...etc. to see the trend in demand for that show. Does that make sense? – this_is_my_idk_account Apr 27 '22 at 17:21

1 Answers1

1

Not sure if this is what you are looking for, here is my best guess:

with 
t1 as (select current_date() - 182 as start_date),
t2 as (select current_date() + 40 as end_date),
t3 as (select row_number() over(order by 0) as i from table(generator(rowcount => 10000)))
select i, start_date + i - 1 as generated_date
from t3 cross join t1 cross join t2
where generated_date <= end_date;

The first query t1 calculates the start date (using current_date() here but assuming you will have a variable for a_beginning_date_formated_as_a_date). The second query t2 calculates the end date (again using current_date() instead of an_end_date_formated_as_a_date). The third query t3 generates 10000 rows and returns a column named i which is the sequence number from 1 to 10000. Finally, t3 is cross joined with t1 and t2, using i (the sequence number) to add days to the start date and finishing using a where condition to exclude rows that go beyond the end date.

The above query can be rewritten more concisely, without t1 and t2, like so:

with t3 as 
(select row_number() over(order by 0) as i from table(generator(rowcount => 10000)))
select i, (current_date() - 182) + i - 1 as generated_date
from t3
where generated_date <= current_date() + 40;
Maja F.
  • 353
  • 2
  • 9