By using window functions in Big Query, I want to calculate the average of a column based on some conditions. I use a window function in stead of a join because my actual dataset is very large and a window function speeds up my query.
I manage to take the average within a window, however I want to be able to exclude the current row and the rows in which the ID is equal to the current row from this average.
create temp table base (Date date, Id string, Input int64, Output int64);
insert base
values('2020-12-05', '123', 4, 12),
('2020-12-06', '456', 3, 20),
('2020-12-07', '657', 4, 45),
('2020-12-08', '789', 4, 25),
('2020-12-07', '789', 4, 35)
;
select *,
avg(Output) over (ReferenceWindow) OutputAvg,
from base
window ReferenceWindow as (partition by Input
order by unix_date(Date)
range between 1 preceding and 1 following
)
order by Id
;
What I get now:
OutputAvg
12
32.5
32.5
30
30
What I want:
OutputAvg
null
null
30
45
45