1

I'm hoping somebody can assist me with a query, I need to build a running total that goes up and down, this is rough idea of the data set

poolname  date                     status
test1     2018-11-01 08:39:09.737  started
test1     2018-11-01 08:40:09.737  started
test1     2018-11-01 08:41:09.737  finished
test1     2018-11-01 08:42:09.737  started
test1     2018-11-01 08:44:09.737  finished
test1     2018-11-01 08:45:09.737  finished
test2     2018-11-01 08:21:09.737  started
test2     2018-11-01 08:22:09.737  started
test2     2018-11-01 08:24:09.737  finished
test2     2018-11-01 08:30:09.737  finished

This is the output I need:

poolname  date                     status     RunningTotal
test1     2018-11-01 08:39:09.737  started    1
test1     2018-11-01 08:40:09.737  started    2
test1     2018-11-01 08:41:09.737  finished   1
test1     2018-11-01 08:42:09.737  started    2
test1     2018-11-01 08:44:09.737  finished   1
test1     2018-11-01 08:45:09.737  finished   0
test2     2018-11-01 08:21:09.737  started    1
test2     2018-11-01 08:22:09.737  started    2
test2     2018-11-01 08:24:09.737  finished   1
test2     2018-11-01 08:30:09.737  finished   0

I hope this make sense, I need to keep it as efficient as possible, I've been attempting to use window functions, LEAD and LAG. Im sure there is a really easy way to do this but I've not figured it out yet.

Yen
  • 23
  • 1
  • 4
  • 2
    Don't post images of text; it's text. Post (tabular formatted) text as **`text`**. At least we can use it then. – Thom A Oct 15 '20 at 13:23
  • *"I've been attempting to use window functions, LEAD and LAG"* And what were those attempts? Why didn't they work? – Thom A Oct 15 '20 at 13:23
  • When I posted with text it didn't appear in the table format I wanted it to and looked unsightly so I added images. As you can tell by my profile I don't use stackoverflow often so I'm not familiar with best practice on formatting. I attempted SUM(date)OVER(PARTITION BY PoolName ORDER BY Date ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) within a CASE statement for started and finished. I've deleted my lead and lag attempts – Yen Oct 15 '20 at 13:38
  • [Edit] your question, don't put your attempts in the comments. As for the table, have a look at [How to make tables in good format in asking questions here (which is understandable by others)?](https://meta.stackexchange.com/a/220128/397219) – Thom A Oct 15 '20 at 13:40
  • I fixed the first image for you. Please use the edit link and see how I did it to fix the second image. – Joel Coehoorn Oct 15 '20 at 16:33
  • Thanks Joel, I've updated that to match – Yen Oct 16 '20 at 15:08

1 Answers1

1

You want a cumulative sum:

select t.*,
       sum(case when status = 'start' then 1
                when status = 'finished' then -1
                else 0
           end) over (partition by poolid order by date) as as runningTotal
from t;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • I believe you typoed a bit - the SUM needs an OVER clause to work as a cumulative (e.g., `SUM(...) OVER (ORDER BY poolname, date) as runningTotal` – seanb Oct 15 '20 at 13:39
  • 1
    Thanks for that, it didnt quite solve it but it gave me the idea to add it to my window function and it appears to be working! Thanks :) – Yen Oct 15 '20 at 13:44
  • I added OVER(PARTITION BY PoolName ORDER BY Date ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) – Yen Oct 15 '20 at 13:46
  • @Yen . . . The `rows between` should not be needed. – Gordon Linoff Oct 15 '20 at 14:51
  • @Yen If you look in documentation you'll find that between have default 'RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW', so order by is sufficient if you have no identical timestamps in your data. – astentx Oct 15 '20 at 15:33
  • @GordonLinoff and others, Many Thanks for all your help :) – Yen Oct 16 '20 at 15:09