1

Let's say I have the following dataset and query:

WITH results as (
    SELECT 'DAL' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2011 as season, 10 as wins union
    SELECT 'DET' as team, 2012 as season, 4 as wins union
    SELECT 'DET' as team, 2013 as season, 7 as wins union
    SELECT 'DET' as team, 2014 as season, 11 as wins union
    SELECT 'DET' as team, 2015 as season, 7 as wins union
    SELECT 'DET' as team, 2016 as season, 9 as wins union
    SELECT 'DET' as team, 2017 as season, 9 as wins union
    SELECT 'DET' as team, 2018 as season, 6 as wins union
    SELECT 'DET' as team, 2019 as season, 3 as wins
) SELECT team, season, wins,
    AVG(wins) OVER (partition by team order by season rows between 2 preceding and current row) AS avg_wins_last_two_seasons
from results;

This will return the following results:

# team, season, wins, avg_wins_last_two_seasons
DAL, 2010, 6, 6.0000
DET, 2010, 6, 6.0000
DET, 2011, 10, 8.0000

However, if the size of the window isn't 'complete' - i.e., there aren't two preceding rows -- then I would like the results to be NULL instead of averaging over the decreased window size -- for example, for 2010 it calculates the avg as [6]/1 = 6 but I want it to calculate as: [NULL, NULL, 6] / 3 = NULL. How would I do this?

GMB
  • 216,147
  • 25
  • 84
  • 135
David542
  • 104,438
  • 178
  • 489
  • 842

2 Answers2

1

You can use row_number() and a case expression:

select team, season, wins,
    case when row_number() over(partition by team order by season) > 2
        then avg(wins) over (
            partition by team 
            order by season 
            rows between 2 preceding and current row
        ) 
    end as avg_wins_last_two_seasons
from results;

This ensures that there are least two preceding rows before computing thte window average; if there is not, then this returns null instead.

GMB
  • 216,147
  • 25
  • 84
  • 135
  • thanks, one question here: why do you use the `ROW_NUMBER()` instead of `COUNT()` or are the two the same in the above case? Does my answer look ok as well? – David542 Nov 22 '20 at 23:48
  • 1
    @David542: yes, `row_number()` and `count()` behave indenticaly in this situation (`row_number()` is a count in essence anyway). You don't need the window frame for `row_number()` (or `count()`) though; you just need to know if there are at least two preceding rows. – GMB Nov 23 '20 at 00:38
  • thanks for the feedback. When you say "You don't need the window frame for..." -- do you mean that we can use the default frame which is `unbounded preceding to current row` ? so we can leave it empty and it's implied, or do you mean something else? Thanks again for the feedback! – David542 Nov 23 '20 at 00:40
  • 1
    @David542: yes the default window frame is good enough for `row_number()`, as shown in my answer. – GMB Nov 23 '20 at 00:42
0

@GMB has the accepted answer here, but here is a slight variation using a named window for readability and showing both options (null and allowing incomplete window size):

WITH results as (
    SELECT 'DAL' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2010 as season, 6 as wins union
    SELECT 'DET' as team, 2011 as season, 10 as wins union
    SELECT 'DET' as team, 2012 as season, 4 as wins union
    SELECT 'DET' as team, 2013 as season, 7 as wins union
    SELECT 'DET' as team, 2014 as season, 11 as wins union
    SELECT 'DET' as team, 2015 as season, 7 as wins union
    SELECT 'DET' as team, 2016 as season, 9 as wins union
    SELECT 'DET' as team, 2017 as season, 9 as wins union
    SELECT 'DET' as team, 2018 as season, 6 as wins union
    SELECT 'DET' as team, 2019 as season, 3 as wins
) SELECT team, season, wins
    ,ROUND(AVG(wins) OVER trailing_2, 1) AS avg_wins_trailing_2
    ,IF(COUNT(1) OVER trailing_2 < 3, NULL, ROUND(AVG(wins) OVER trailing_2,1))AS avg_wins_trailing_2_if_full_window
FROM results
WINDOW trailing_2 AS (
    PARTITION BY team
    ORDER BY season
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
);

# team, season, wins, avg_wins_trailing_2, avg_wins_trailing_2_if_full_window
DAL, 2010, 6, 6.0, null
DET, 2010, 6, 6.0, null
DET, 2011, 10, 8.0, null
DET, 2012, 4, 6.7, 6.7
DET, 2013, 7, 7.0, 7.0
David542
  • 104,438
  • 178
  • 489
  • 842