3

I have a very specific problem in T-SQL.

If I can solve this example case I give you I think I will be able to solve my original case.

Having this data in a table:

DECLARE @Test TABLE
(
    Value INT
    ,Date DATETIME2(7)
);

INSERT INTO @Test
VALUES
(NULL, '2011-01-01 10:00'),
(NULL, '2011-01-01 11:00'),
(2, '2011-01-01 12:00'),
(NULL, '2011-01-01 13:00'),
(3, '2011-01-01 14:00'),
(NULL, '2011-01-01 15:00'),
(NULL, '2011-01-01 16:00'),
(4, '2011-01-01 17:00'),
(NULL, '2011-01-01 18:00'),
(5, '2011-01-01 19:00'),
(6, '2011-01-01 20:00')

I need to select this output:

Value   Date
2       2011-01-01 10:00
2       2011-01-01 11:00
2       2011-01-01 12:00
2       2011-01-01 13:00
3       2011-01-01 14:00
3       2011-01-01 15:00
3       2011-01-01 16:00
4       2011-01-01 17:00
4       2011-01-01 18:00
5       2011-01-01 19:00
6       2011-01-01 20:00

To give some explanation. If value is NULL somewhere I need to update with the value from the previous hour. If there are several null values in a row the closest earlier hour with a non null value propagates and fills all these null values. Also if the first hour of the day is null then the earliest hour on the day with a non null value propagates downwards like 2 in this case. In your case you can assume that at least one value is non null value.

My ambition is to solve this with Common table expressions or something similar. With the cursor way I think I would have the solution in short bit of time if I try but my attempts with CTEs and recursive CTEs have failed so far.

John
  • 2,043
  • 5
  • 28
  • 49
  • what have you tried? It is helpful to post what you have done so far - you might be close and need just a few suggestions to correct your existing query. – Taryn May 18 '12 at 14:29
  • Since I don't know how many null values there are in a row a single join on previous hour is not enough. There for I thought maybe I could solve it with recursion some way. I have only used CTE recursion the classic hierarchy way but my idea is that if I can update one null value each time in the recursion maybe I could fill them all. Actually I lost it pretty early when trying this idea.. So I don't believe there is much that will be of any help. – John May 18 '12 at 14:34
  • In your example, the `Value`s are non-decreasing as `Date` increases. Is this coincidence, or will it always be the case? – AakashM May 18 '12 at 14:51
  • +1 for actually including a "starting point" of a working table and data inserting code – KM. May 18 '12 at 17:34

2 Answers2

3

Since your condition is not always the same this is a little bit more difficult. In your example, the first two rows need to get their values from the first value with a later date, in the other cases they need to get the values from previous dates. If you would always need to look previous dates, you could simple do this query:

SELECT  B.Value,
        A.[Date]
FROM @Test A
OUTER APPLY (SELECT TOP 1 *
             FROM @Test
             WHERE [Date] <= A.[Date] AND Value IS NOT NULL
             ORDER BY [Date] DESC) B

But in your case, I think that you need this instead:

SELECT  ISNULL(B.Value,C.Value) Value,
        A.[Date]
FROM @Test A
OUTER APPLY (SELECT TOP 1 *
             FROM @Test
             WHERE [Date] <= A.[Date] AND Value IS NOT NULL
             ORDER BY [Date] DESC) B
OUTER APPLY (SELECT TOP 1 *
             FROM @Test
             WHERE Value IS NOT NULL
             ORDER BY [Date]) C
Lamak
  • 69,480
  • 12
  • 108
  • 116
  • I was a little bit to quick when I accepted the other answer actually. It was only working with my sample data which was increasing every hour. This is the correct answer, and it gives me the right results. But I have terrible performance problems with my real data. Any suggestions what you could possibly do to increase the performance a little bit? – John May 25 '12 at 07:39
  • @John - Yeah, its definitely gonna have bad performance since you are going through the table `test` at least twice. You know, the second `OUTER APPLY` is only there for when the first values are `NULL`, maybe you could avoid that on a better way. And I'm assuming that you at least have an index on the column `Date`. – Lamak May 25 '12 at 13:24
2

try this:

select 
    t.value, t.date
      ,COALESCE(t.value
               ,(select MAX(tt.value) from @Test tt WHERE t.Date>tt.Date)
               ,(SELECT MIN(ttt.Value) FROM @Test ttt Where ttt.Date IS NOT NULL)
               ) AS UseValue
    from @Test   t

OUTPUT:

value       date                    UseValue
----------- ----------------------- -----------
NULL        2011-01-01 10:00:00.000 2
NULL        2011-01-01 11:00:00.000 2
2           2011-01-01 12:00:00.000 2
NULL        2011-01-01 13:00:00.000 2
3           2011-01-01 14:00:00.000 3
NULL        2011-01-01 15:00:00.000 3
NULL        2011-01-01 16:00:00.000 3
4           2011-01-01 17:00:00.000 4
NULL        2011-01-01 18:00:00.000 4
5           2011-01-01 19:00:00.000 5
6           2011-01-01 20:00:00.000 6
KM.
  • 101,727
  • 34
  • 178
  • 212
  • I accepted this answer at first but I actually went with the other solution above since I thought both were working. But now when I had some performance problems with the other solution I started to try this one out and found out it does not work. It only works in this example case because value is increasing every hour. Try to shuffle the values around a bit and see what happens :) – John May 25 '12 at 07:41
  • @John, if only you had a proper PK! – KM. May 29 '12 at 12:32