9

I've made the following calculation involving LAG():

(lag(fValue,1,fValue) OVER (PARTITION BY Cluster ORDER BY iSequence) + fValue) / 2 as fValueAjusted

It takes the previous (based on iSequence) record's fValue, sums with current one, and divides it by 2.

But, instead of using fValue, I must do that using previous record's fValueAjusted.

It means that first record's fValueAjusted will be its own fValue. Second record's fValueAjusted will be based on first record's fValue. And, starting from third record, it's calculation will be based on previous record's fValueAjusted.

I need fValueAjusted to be calculated recursively over the previous record's fValueAjusted. I can't figure out how to do that.

Update 1: This is an example of source data. Real table has hundreds of records and 80 clusters.

CREATE TABLE dbo.example (
    iUnity      int NOT NULL,
    Cluster     char(2) NOT NULL,
    fValue      float NOT NULL
)

15  A1      150
17  A1      170
21  B2      210
23  B2      230
71  C3      710

This is the CTE that calculates the sequence:

WITH cteSequencing AS (
    SELECT
        iUnity,Cluster
        ,fValue as fValueOriginal
        ,row_number() OVER (PARTITION BY Cluster ORDER BY fValueOriginal) as iSequence
    FROM dbo.example
)

If fValueAjusted would be calculated based on fValueOriginal, the query would look like:

SELECT
    iUnity,Cluster,fValueOriginal
    ,(
        lag(fValue,1,fValue) OVER (PARTITION BY Cluster ORDER BY iSequence)
        + fValueOriginal
    ) / 2 as fValueAjusted
FROM cteSequencing

But fValueAjusted of one record must be calculated based on fValueAjusted of previous record. It would be something like:

SELECT
    iUnity,Cluster,fValueOriginal
    ,(
        lag(fValueAjusted,1,fValueOriginal) OVER (PARTITION BY Cluster ORDER BY iSequence)
        + fValueOriginal
    ) / 2 as fValueAjusted
FROM cteSequencing

Of course fValueAjusted isn't available when it executes. LAG() must go recursively, calculating the column for a record then providing this column for next record to use.

halfer
  • 19,824
  • 17
  • 99
  • 186
Hikari
  • 3,797
  • 12
  • 47
  • 77
  • 2
    sample data and expected output would help here. – S3S Jul 10 '17 at 16:09
  • Make an additional column (or a temp table), insert the value, then use it for calculations. Best suggestion I can do with the limited information provided. – Jacob H Jul 10 '17 at 16:29
  • I can't use a temp table because the calculation is recursive. The input of the "function" is the output of the "function" itself. If I could do it using function it'd be able to do it, but I can't figure out how. – Hikari Jul 10 '17 at 16:42
  • I'm gonna try to provide some example. – Hikari Jul 10 '17 at 16:43

1 Answers1

10

UPDATE: Original Answer was not correct

Here is the correct one:

The code uses recursive CTEs

CREATE TABLE #example (
    iUnity      int NOT NULL,
    Cluster     char(2) NOT NULL,
    fValue      float NOT NULL
)
INSERT INTO #example
VALUES
( 15,  'A1',      150 ),
( 16,  'A1',      170 ),
( 17,  'A1',      190 ),
( 18,  'A1',      210 ),
( 21,  'B2',      210 ),
( 23,  'B2',      230 ),
( 71,  'C3',      710 )

WITH cteSequencing AS (
    -- Get Values Order
    SELECT iUnity, Cluster, fValue, fValue AS fValueAjusted,
        ROW_NUMBER() OVER (PARTITION BY Cluster ORDER BY fValue) AS iSequence
    FROM #example
),
Recursion AS(
    -- Anchor - the first value in clusters
    SELECT iUnity, Cluster, fValue, fValueAjusted, iSequence
    FROM cteSequencing
    WHERE iSequence = 1
    UNION ALL
    -- Calculate next value based on the previous
    SELECT b.iUnity As iUnity, b.Cluster, b.fValue,
        ( a.fValueAjusted + b.fValue ) / 2 AS fValueAjusted,
        b.iSequence
    FROM Recursion AS a
        INNER JOIN cteSequencing AS b ON a.iSequence + 1 = b.iSequence AND a.Cluster = b.Cluster
)
SELECT * FROM Recursion ORDER BY Cluster, fValue

-- Manually check results
SELECT ( 150 + 170 ) / 2
SELECT ( 190 + 160 ) / 2 
SELECT ( 190 + 170 ) / 2

Output:

iUnity      Cluster fValue                 fValueAjusted          iSequence
----------- ------- ---------------------- ---------------------- --------------------
15          A1      150                    150                    1
16          A1      170                    160                    2
17          A1      190                    175                    3
18          A1      210                    192.5                  4
21          B2      210                    210                    1
23          B2      230                    220                    2
71          C3      710                    710                    1

Update

If you encounter

The maximum recursion 100 has been exhausted before statement completion

error, then use OPTION (MAXRECURSION xxx) to set a larger recursion limit up-to 32,767.

Alex
  • 4,885
  • 3
  • 19
  • 39
  • I'm kind of in the same boat, albeit using Presto, and am worrying that it would required recursion as well; which I currently have no means to use unfortunately. I guess there's no other way around it, NOT requiring recursion....? – Jonny Jul 27 '20 at 11:53
  • 1
    @Jonny - For this specific problem you either need recursion or an **ability** to store intermediate calculation results. AFAIK (in SQL Server) without recursion you would need to use loops/cursor loops. – Alex Jul 27 '20 at 19:46