The formula for weighted average is:
weighted_average = sum of (value * weight) / sum of weights
Since you have timestamp, you can use that field to compute the weighted average of price based on quantity column for each product with a window frame that includes all rows within the last 500 seconds. Below is the query
Query:
with Stgquery as(
select
product, SUM(price) OVER (PARTITION BY product LIMIT DURATION (second, 500)) as denom,
sum(cast(price as float) * cast(quantity as float)) over (PARTITION by product LIMIT DURATION (second, 500)) as num
from
input)
select product,weighted_average=(num/denom) from Stgquery
In this query, CTE stgquery
selects the product
, the cumulative sum of the price
column for each group of product
values, and the sum of the product of the price
and quantity
columns for each group of product
values ( using the SUM
function with a window frame that includes all rows within the last 500 seconds. This can be changed as per requirement). The SELECT
statement selects the product
column and calculates the weighted average of the price
column based on the quantity
column for each group of product
values. The weighted average is calculated by dividing the sum of the product of the price
and quantity
columns (stored in the num
column of the Stgquery
CTE) by the cumulative sum of the price
column (stored in the denom
column of the Stgquery
CTE). The result is returned as a new column named weighted_average
.
Output:
product |
weighted_average |
1 |
100 |
2 |
200 |
2 |
200 |
1 |
204.3478 |
2 |
161.2676 |
2 |
134.9462 |
2 |
127.897 |
1 |
175.4717 |
1 |
157.971 |