2

I am having a weird result when I am trying to get the LAST_VALUE from a table in SQL Server 2012.

This is the table I have

PK | Id1 | Id2
1  | 2   | 5
2  | 2   | 6
3  | 2   | 5
4  | 2   | 6 

This is my query

SELECT
    Id1, Id2, LAST_VALUE(PK) OVER (PARTITION BY Id1 ORDER BY Id2) AS LastValue
FROM
    @Data

This is the result I am expecting

Id1 | Id2  | LastValue
2   | 5    | 3
2   | 5    | 3
2   | 6    | 4
2   | 6    | 4

This is what I am receiving

Id1 | Id2  | LastValue
2   | 5    | 3
2   | 5    | 3
2   | 6    | 2
2   | 6    | 2

Here is a demonstration of the problem
http://sqlfiddle.com/#!6/5c729/1

Is there anything wrong with my query?

Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90
CrApHeR
  • 2,595
  • 4
  • 25
  • 40
  • 5
    Why are you expecting those values?, both are valid, because the order is being done by `Id2`, so it chooses one of them – Lamak Apr 30 '15 at 15:13
  • You are right, both solutions are valid. I was expecting that order because it is the order which the data was inserted in the table. I think I am mixing concepts and I have to use MAX instead of LAST_VALUE to get what I am looking for. – CrApHeR Apr 30 '15 at 15:20
  • Because the result is not the expected. You can check the result returned with your order here http://sqlfiddle.com/#!6/5c729/9 – CrApHeR Apr 30 '15 at 15:36
  • So do you actually want the last value according to the order? Or do you want a max value? – SQLChao Apr 30 '15 at 15:39
  • In this case, I need the MAX value but I had another case where I need the last inserted value. In the case I need the last inserted value (the question apply for that case), I don't know how to change the query to do it. – CrApHeR Apr 30 '15 at 17:44

3 Answers3

3

SQL Server doesn't know or care about the order in which rows were inserted into the table. If you need specific order, always use ORDER BY. In your example ORDER BY is ambiguous, unless you include PK into the ORDER BY. Besides, LAST_VALUE function can return odd results if you are not careful - see below.

You can get your expected result using MAX or LAST_VALUE (SQLFiddle). They are equivalent in this case:

SELECT
    PK, Id1, Id2
    ,MAX(PK) OVER (PARTITION BY Id1, Id2) AS MaxValue
    ,LAST_VALUE(PK) OVER (PARTITION BY Id1, Id2 ORDER BY PK rows between unbounded preceding and unbounded following) AS LastValue
FROM
    Data
ORDER BY id1, id2, PK

Result of this query will be the same regardless of the order in which rows were originally inserted into the table. You can try to put INSERT statements in different order in the fiddle. It doesn't affect the result.

Also, LAST_VALUE behaves not quite as you'd intuitively expect with default window (when you have just ORDER BY in the OVER clause). Default window is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, while you'd expected it to be ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. Here is a SO answer with a good explanation. The link to this SO answer is on MSDN page for LAST_VALUE. So, once the row window is specified explicitly in the query it returns what is needed.


If you want to know the order in which rows were inserted into the table, I think, the most simple way is to use IDENTITY. So, definition of your table would change to this:

CREATE TABLE Data 
(PK INT IDENTITY(1,1) PRIMARY KEY,
Id1 INT,
Id2 INT)

When you INSERT into this table you don't need to specify the value for PK, the server would generate it automatically. It guarantees that generated values are unique and growing (with positive increment parameter), even if you have many clients inserting into the table at the same time simultaneously. There may be gaps between generated values, but the relative order of the generated values will tell you which row was inserted after which row.

Community
  • 1
  • 1
Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90
1

It is never a good idea to rely on implicit order caused by the particular implementation of the underlying database engine.

I don't know why, running the query

SELECT * FROM @Data ORDER BY Id2

the result will be

+----+-----+-----+
| PK | id1 | id2 |
+----+-----+-----+
|  1 |   2 |   5 |
|  3 |   2 |   5 |
|  4 |   2 |   6 |
|  2 |   2 |   6 |
+----+-----+-----+

which means SQL Server decided the order of rows in a way that is different from the insert order.

That's why the LAST_VALUE behavior is different from expected, but is consistent with the SQL Server sort method.

But how SQL Server sort your data?

The best answer we have is the accepted answer of this question (from where I took the sentence in the beginning of my answer).

Community
  • 1
  • 1
Nizam
  • 4,569
  • 3
  • 43
  • 60
0
SELECT  Id1
        , Id2
        , LAST_VALUE(PK) 
            OVER (PARTITION BY Id1 
            ORDER BY Id2 DESC) AS LastValue
FROM    Data
ORDER BY Id2 ASC

Result

Id1 Id2 LastValue
2   5   3
2   5   3
2   6   4
2   6   4
Ray Krungkaew
  • 6,652
  • 1
  • 17
  • 28