I want to use Analytics SQL to combine all windowed values of column in the input stream into an array in the output stream.
Say I have a data coming in that looks like this:
event | product_id | product_price |
---|---|---|
page_view | 1111 | 3.99 |
page_view | 2222 | 10.99 |
The following SQL would give me output that shows for each 30 minute window, how many products were viewed and what the highest price among them was.
CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
"event_type" varchar(32),
"count_products_viewed" integer,
"max_price_viewed" decimal(7,2)
);
CREATE OR REPLACE PUMP "STREAM_PUMP" AS
INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM
"stream_1"."event",
COUNT("stream_1"."product_id") as "count_products_viewed,
MAX("stream_1"."product_price") as "max_price_viewed"
FROM "SOURCE_SQL_STREAM_001" as "stream_1"
WINDOWED BY STAGGER (
PARTITION BY
"stream_1"."event"
RANGE INTERVAL '30' MINUTE
);
So the output data would like something like
event | count_products_viewed | max_price_viewed |
---|---|---|
page_view | 2 | 10.99 |
If I wanted to add another column so that it looked like
event | count_products_viewed | max_price_viewed | list_products_viewed |
---|---|---|---|
page_view | 2 | 10.99 | [1111, 2222] |
What would I need to do? Tried combing through the docs but didn't find anything for working with all of a column's values during a window.
It doesn't need to be a literal array - I'm fine with a json string of the same data.
Thanks!