I'm trying to do a sort of running total with a variable window size. I want to sum a value from the current row to the last time another row met a condition.
Mock Data: I want a formula for the last column
ID | condition_col | col_to_sum | running_total_since_last_true |
---|---|---|---|
01 | TRUE | 10 | 10 |
02 | FALSE | 8 | 18 |
03 | FALSE | 12 | 30 |
04 | TRUE | 5 | 5 |
05 | FALSE | 8 | 13 |
06 | FALSE | 10 | 23 |
07 | FALSE | -2 | 21 |
08 | TRUE | 1 | 1 |
09 | FALSE | 1 | 2 |
What I have tried:
select ID
, condition_col
, col_to_sum
, (
select sum(col_to_sum) as total
from tbl t1
where t1.ID between tbl.ID and (select top 1 t2.ID
from tbl t2
where condition_col = TRUE
order by t2.ID))
) as running_total_since_last_true
from tbl
Because of the size of the dataset and the other things going on in the query (not shown here) this resulted in the query taking several minutes to run for a few thousand rows.
Now I'm looking at using sum() over(rows between current row and (select...)
I feel like I'm on the right track, but I keep getting stuck.
I also tried lag() but I couldn't make the lag reference itself properly.
Any ideas?