0

In almost all RDMS products, we have windowing functions. Broadly, they are classified as:

  1. Ranking functions (e.g: rank(), dense_rank() etc...)
  2. Aggregate functions (e.g: sum(), avg() etc..)
  3. Analytic functions (e.g: first(), lag() etc..)

In my understanding (by reading various blogs/documents), the 'frame' (either using rows/range) is considered into account only when we use aggregate functions. while applying ranking and analytical functions, 'frame' is not considered (either the default frame or explicitly provided frame).

Is my understanding correct ?

For example: let us consider a table having: id, year and salary. We have multiple ids and under each id & we have multiple year values. All salary values for an id are distinct.

And let us consider the below cases (used I select clause):

  1. rank() over( partition by id order by year)
  2. rank() over( partition by id order by year rows between 1 preceding and current row)
  3. rank() over( partition by id order by year range between 1 preceding and current row)

All of the above versions return the exact same results.

But when we use sum(salary) and apply the same above over() clauses, they return different results.

Input table:
--------------------
Id  Year   Salary
1   2005   10000
1   2007   20000
1   2009   30000
1   2010   40000
2   2005   10000
2   2007   20000
2   2009   30000
2   2010   40000

All the below queries give same results when rank is calculated; which implies that range is NOT taken into account while deriving rank.

 select  id
       ,year
       ,salary    
       ,rank() over( partition by id order by year) as x 
 from  practice;
 
 select  id
       ,year
       ,salary    
       ,rank() over( partition by id order by year rows between 1 preceding and current row) as x 
 from  practice;

 select  id
       ,year
       ,salary    
       ,rank() over( partition by id order by year range between 1 preceding and current row) as x 
 from  practice;

Result:

Id  Year   Salary  X
1   2005   10000   1
1   2007   20000   2
1   2009   30000   3
1   2010   40000   4
2   2005   10000   1
2   2007   20000   2
2   2009   30000   3
2   2010   40000   4

When sum(salary) is given, they start giving different results.

Thanks!

user3103957
  • 636
  • 4
  • 16
  • 1
    Tag your question with the database you are using. If a particular database is ignoring the frame for any window function, I would expect it to generate a syntax error. – Gordon Linoff Feb 14 '21 at 17:30
  • Thanks Gordon for your time looking into this! I am experimenting with My-Sql actually. Now I have provided the data as well as the actual queries I use. Thanks! – user3103957 Feb 15 '21 at 02:04
  • . . Very interesting. It seems to be ignoring the window frame rather than generating an error. – Gordon Linoff Feb 15 '21 at 02:11
  • It should be a syntax error when you try to specify a window frame definition for a `rank` function. Too bad that MySQL silently ignores such errors instead of showing them. – Vladimir Baranov Feb 16 '21 at 06:02

0 Answers0