2

I'm attempting to create a recursive CTE statement that adds blank rows in between data points that will later for interpolation. I'm a beginner with SQL and this is my first time using CTE's and am having some difficulty finding the proper way to do this.

I've attempted a few different slight variations on the code I have provided below after some research but haven't grasped a good enough understanding to see my issue yet. The following code should simulate sparse sampling by taking a observation every 4 hours from the sample data set and the second portion should add rows with there respective x values every 0.1 of an hour which will later be filled with interpolated values derived from a cubic spline.

--Sample Data

create table #temperatures (hour integer, temperature double precision);
insert into #temperatures (hour, temperature) values
(0,18.5),
(1,16.9),
(2,15.3),
(3,14.1),
(4,13.8),
(5,14.7),
(6,14.7),
(7,13.5),
(8,12.2),
(9,11.4),
(10,10.9),
(11,10.5),
(12,12.3),
(13,16.4),
(14,22.3),
(15,27.2),
(16,31.1),
(17,34),
(18,35.6),
(19,33.1),
(20,25.1),
(21,21.3),
(22,22.3),
(23,20.3),
(24,18.4),
(25,16.8),
(26,15.6),
(27,15.4),
(28,14.7),
(29,14.1),
(30,14.2),
(31,14),
(32,13.9),
(33,13.9),
(34,13.6),
(35,13.1),
(36,15),
(37,18.2),
(38,21.8),
(39,24.1),
(40,25.7),
(41,29.9),
(42,28.9),
(43,31.7),
(44,29.4),
(45,30.7),
(46,29.9),
(47,27);

--1

WITH xy (x,y)
AS 
    (
  SELECT  TOP 12
    CAST(hour AS double precision) AS x
    ,temperature AS y 
    FROM #temperatures 
    WHERE cast(hour as integer) % 4 = 0
   )

Select x,y
INTO #xy
FROM xy

Select [x] As [x_input]
INTO #x_series
FROM #xy

--2

    with recursive
  , x_series(input_x) as (
    select
      min(x)
    from
      #xy
    union all
    select
      input_x + 0.1
    from
      x_series
    where
      input_x + 0.1 < (select max(x) from x)
  )
  , x_coordinate as (
  select
    input_x
    , max(x) over(order by input_x) as previous_x
  from
    x_series
  left join
    #xy on abs(x_series.input_x - xy.x) < 0.001
  )

The first CTE works as expected and produces a list of 12 (a sample every 4 hours for two days) but the second produces syntax error. The expected out put would be something like

(4,13.8), (4.1,null/0), (4.2,null/0),....., (8,12.2)
Lyric
  • 43
  • 1
  • 5
  • 1
    Sorry Juan I can provide the sample data and the expected output, what are you meaning when you say db schema? – Lyric Aug 15 '19 at 14:32
  • The `CREATE TABLE` for your tables, so we know datatype, primary key, index – Juan Carlos Oropeza Aug 15 '19 at 14:33
  • https://en.wikipedia.org/wiki/Database_schema – Renat Aug 15 '19 at 14:33
  • Made those adjustments thanks for the tips. – Lyric Aug 15 '19 at 14:43
  • You should include what is the error. – Juan Carlos Oropeza Aug 15 '19 at 14:53
  • Within the scope of the question, the objects from your final `SELECT` statement, `test` and `x-coordinate`, don't exist. But even if they did, that `SELECT` is wonky. Is that an old school comma join? What does `x_coordinate AS...` mean? Sort of everything after 'SELECT * FROM test` looks a little suspect. – Eric Brandt Aug 15 '19 at 15:01
  • That output is from query #1 or #2? because looks like is from #1 and you already said that is working. We need what result you want for #2 – Juan Carlos Oropeza Aug 15 '19 at 15:06
  • I want the result from #2 to produce a result set with all the x and y values from the sampling in addition to having rows with values at every 0.1 step but empty y values that will be filled in later. So I have a x and y value at (4,13.8) and then rows at x = 4.1,4.2....7.9 and then have another full row at x = 8 with y =12.2. I'm using this post as the structure for my example. [link](https://www.periscopedata.com/blog/spline-interpolation-in-sql) – Lyric Aug 15 '19 at 15:11
  • 1
    just to be clear the sequence is `4.1,4.2,4.3 ... 4.9,5.0,5.1, .... 7.9,8.0` ? – Juan Carlos Oropeza Aug 15 '19 at 15:13
  • Yes that is correct with y values only being populated at 4.0 and 8.0 ... 12.0 ... 16.0 – Lyric Aug 15 '19 at 15:15

3 Answers3

1

I don't think you need a recursive CTE here. I think a solution like this would be a better approach. Modify accordingly.

DECLARE @max_value FLOAT = 
    (SELECT MAX(hour) FROM  #temperatures) * 10

INSERT INTO #temperatures (hour, temperature)
SELECT X.N / 10, NULL
FROM (
    select CAST(ROW_NUMBER() over(order by t1.number) AS FLOAT) AS N
    from   master..spt_values t1 
           cross join master..spt_values t2
) X
WHERE X.N <= @max_value
    AND X.N NOT IN (SELECT hour FROM #temperatures)
Robert Sievers
  • 1,277
  • 10
  • 15
1

I dont think you need recursive.

What about this:

SQL DEMO

SELECT DISTINCT n = number *1.0 /10 , #xy.x, #xy.y
FROM master..[spt_values] step
LEFT JOIN #xy
  ON step.number*1.0 /10  = #xy.x
WHERE number BETWEEN 40 AND 480

This 480 is based on the two days you mention.

OUTPUT

enter image description here

You dont even need the temporal table

SELECT DISTINCT n = number *1.0 /10 , #temperatures.temperature
FROM master..[spt_values] step
LEFT JOIN #temperatures
  ON step.number *1.0 / 10  = #temperatures.hour
 AND  #temperatures.hour % 4 = 0
WHERE number BETWEEN 40 AND 480;
Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118
0

Use the temp table #xy produced in --1 you have, the following will give you a x series:

;with x_series(input_x)
as
(
    select min(x) AS input_x
    from #xy
    union all
    select input_x + 0.1
    from x_series
    where input_x + 0.1 < (select max(x) from #xy)
)
SELECT * FROM x_series;
PeterHe
  • 2,766
  • 1
  • 8
  • 7