1

This is simplified question for more complicated one posted here:

Recursive SQL statement (PostgreSQL 9.1.4)

Simplified question

Given you have upper triangular matrix stored in 3 columns (RowIndex, ColumnIndex, MatrixValue):

   ColumnIndex       
    1   2   3   4   5
1   2   2   3   3   4
2   4   4   5   6   X
3   3   2   2   X   X
4   2   1   X   X   X
5   1   X   X   X   X

X values are to be calculated using the following algorithm:

M[i,j] = (M[i-1,j]+M[i,j-1])/2
(i= rows, j = columns, M=matrix)

Example:
M[3,4] = (M[2,4]+M[3,3])/2
M[3,5] = (m[2,5]+M[3,4])/2

The full required result is:

   ColumnIndex       
    1   2   3    4      5
1   2   2   3    3      4
2   4   4   5    6      5
3   3   2   2    4      4.5
4   2   1   1.5  2.75   3.625
5   1   1   1.25 2.00   2.8125

Sample data:

create table matrix_data (
    RowIndex integer,
    ColumnIndex integer,
    MatrixValue numeric);

    insert into matrix_data values (1,1,2);
    insert into matrix_data values (1,2,2);
    insert into matrix_data values (1,3,3);
    insert into matrix_data values (1,4,3);
    insert into matrix_data values (1,5,4);
    insert into matrix_data values (2,1,4);
    insert into matrix_data values (2,2,4);
    insert into matrix_data values (2,3,5);
    insert into matrix_data values (2,4,6);
    insert into matrix_data values (3,1,3);
    insert into matrix_data values (3,2,2);
    insert into matrix_data values (3,3,2);
    insert into matrix_data values (4,1,2);
    insert into matrix_data values (4,2,1);
    insert into matrix_data values (5,1,1);

Can this be done?

Community
  • 1
  • 1
Tomas Greif
  • 21,685
  • 23
  • 106
  • 155

4 Answers4

2

Test setup:

CREATE TEMP TABLE matrix (
    rowindex integer,
    columnindex integer,
    matrixvalue numeric);

INSERT INTO matrix VALUES
 (1,1,2),(1,2,2),(1,3,3),(1,4,3),(1,5,4)
,(2,1,4),(2,2,4),(2,3,5),(2,4,6)
,(3,1,3),(3,2,2),(3,3,2)
,(4,1,2),(4,2,1)
,(5,1,1);

Run INSERTs in a LOOP with DO:

DO $$
BEGIN

FOR i IN 2 .. 5 LOOP
   FOR j IN 7-i .. 5 LOOP
      INSERT INTO matrix
      VALUES (i,j, (
         SELECT sum(matrixvalue)/2
         FROM   matrix
         WHERE  (rowindex, columnindex) IN ((i-1, j),(i, j-1))
         ));
   END LOOP;
END LOOP;

END;
$$

See result:

SELECT * FROM matrix order BY 1,2;
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
1

This can be done in a single SQL select statement, but only because recursion is not necessary. I'll outline the solution. If you actually want the SQL code, let me know.

First, notice that the only items that contribute to the sums are along the diagonal. Now, if we follow the contribution of the value "4" in (1, 5), it contributes 4/2 to (2,5) and 4/4 to (3,5) and 4/8 to (4,5). Each time, the contribution is cut in half, because (a+b)/2 is (a/2 + b/2).

When we extend this, we start to see a pattern similar to Pascal's triangle. In fact, for any given point in the lower triangular matrix (below where you have values), you can find the diagonal elements that contribute to the value. Extend a vertical line up to hit the diagonal and a horizontal line to hit the diagonal. Those are the contributors from the diagonal row.

How much do they contribute? Well, for that we can go to Pascal's triangle. For the first diagonal below where we have values, the contributions are (1,1)/2. For the second diagonal, (1,2,1)/4. For the third, (1,3,3,1)/8 . . . and so on.

Fortunately, we can calculate the contributions for each value using a formula (the "choose" function from combinatorics). The power of 2 is easy. And, determining how far a given cell is from the diagonal is not too hard.

All of this can be combined into a single Postgres SQL statement. However, @Erwin's solution also works. I only want to put the effort into debugging the statement if his solution doesn't meet your needs.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

... and here comes the recursive CTE with multiple embedded CTE's (tm):

DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;

CREATE TABLE matrix_data (
    yyy integer,
    xxx integer,
    val numeric);

    insert into matrix_data (yyy,xxx,val) values
      (1,1,2) , (1,2,2) , (1,3,3) , (1,4,3) , (1,5,4)
    , (2,1,4) , (2,2,4) , (2,3,5) , (2,4,6)
    , (3,1,3) , (3,2,2) , (3,3,2)
    , (4,1,2) , (4,2,1)
    , (5,1,1)
        ;

WITH RECURSIVE rr AS (
        WITH xx AS (
                SELECT MIN(xxx) AS x0
                , MAX(xxx) AS x1
                FROM matrix_data
                )
        , mimax AS (
                SELECT generate_series(xx.x0,xx.x1) AS xxx
                FROM xx
                )
        , yy AS (
                SELECT MIN(yyy) AS y0
                , MAX(yyy) AS y1
                FROM matrix_data
                )
        , mimay AS (
                SELECT generate_series(yy.y0,yy.y1) AS yyy
                FROM yy
                )
        , cart AS (
                SELECT * FROM mimax mm
                JOIN mimay my ON (1=1)
                )
        , empty AS (
                SELECT * FROM cart ca
                WHERE NOT EXISTS (
                        SELECT *
                        FROM matrix_data nx
                        WHERE nx.xxx = ca.xxx
                        AND nx.yyy = ca.yyy
                        )
                )
        , hot AS (
                SELECT * FROM empty emp
                WHERE EXISTS (
                        SELECT *
                        FROM matrix_data ex
                        WHERE ex.xxx = emp.xxx -1
                        AND ex.yyy = emp.yyy
                        )
                AND EXISTS (
                        SELECT *
                        FROM matrix_data ex
                        WHERE ex.xxx = emp.xxx
                        AND ex.yyy = emp.yyy -1
                        )
                    )
        -- UPDATE from here:
        SELECT h.xxx,h.yyy, md.val / 2 AS val
        FROM hot h
        JOIN matrix_data md ON
                (md.yyy = h.yyy AND md.xxx = h.xxx-1)
                OR (md.yyy = h.yyy-1 AND md.xxx = h.xxx)
        UNION ALL
        SELECT e.xxx,e.yyy, r.val / 2 AS val
        FROM empty e
        JOIN rr r ON ( e.xxx = r.xxx+1 AND e.yyy = r.yyy)
                OR ( e.xxx = r.xxx AND e.yyy = r.yyy+1 )
        )
INSERT INTO matrix_data(yyy,xxx,val)
SELECT DISTINCT yyy,xxx
        ,SUM(val)
FROM rr
GROUP BY yyy,xxx
        ;

SELECT * FROM matrix_data
        ;

New result:

NOTICE:  drop cascades to table tmp.matrix_data
DROP SCHEMA
CREATE SCHEMA
SET
CREATE TABLE
INSERT 0 15
INSERT 0 10
 yyy | xxx |          val           
-----+-----+------------------------
   1 |   1 |                      2
   1 |   2 |                      2
   1 |   3 |                      3
   1 |   4 |                      3
   1 |   5 |                      4
   2 |   1 |                      4
   2 |   2 |                      4
   2 |   3 |                      5
   2 |   4 |                      6
   3 |   1 |                      3
   3 |   2 |                      2
   3 |   3 |                      2
   4 |   1 |                      2
   4 |   2 |                      1
   5 |   1 |                      1
   2 |   5 |     5.0000000000000000
   5 |   5 | 2.81250000000000000000
   4 |   3 | 1.50000000000000000000
   3 |   5 | 4.50000000000000000000
   5 |   2 | 1.00000000000000000000
   3 |   4 | 4.00000000000000000000
   5 |   3 | 1.25000000000000000000
   4 |   5 | 3.62500000000000000000
   4 |   4 | 2.75000000000000000000
   5 |   4 | 2.00000000000000000000
(25 rows)
wildplasser
  • 43,142
  • 8
  • 66
  • 109
  • Wow, this if recursive CTEism in its terminal state! :) If you compare the ~ 15 lines of code in my solution, this seems rather monstrous. BTW, the result is not right. – Erwin Brandstetter Jul 19 '12 at 18:59
  • I just hate imperative, procedural code, I had to do something! But it is still not right... Cannot reference recursive CTE twice+ cannot use aggregates. I'm stuck. Nice try, though ;-] BTW: the other way is to first build a pascal's triangle, and then update from that. Order is important. – wildplasser Jul 19 '12 at 19:03
  • Got it. The point was pulling the sum into the outer query. Do I get the CTE-award, now? – wildplasser Jul 19 '12 at 19:16
  • I guess you deserve *some* kind of medal. :) +1 for tenacity! – Erwin Brandstetter Jul 19 '12 at 19:28
  • You could probably pull all CTEs up through `cart` (and `empty`, probably) out of the recursive CTE. Also, maybe consider doing nested `SELECT`s for `xx` and `yy` into their series generators. And does postGre not have a `CROSS JOIN` clause (for `cart`)? You could potentially roll up to that point too. Personally, I'd probably prefer a recursive query, given the nature of the problem. – Clockwork-Muse Jul 19 '12 at 19:51
  • Yes for the ease of building it up, I kept it more or less linear. Only 'hot' and 'empty' are referenced from the outside, IIRC. The rest could be pushed downwards, but that would be harder to read. IMHO `ON (1=1)` is a very explicit way to express a cartesian product. – wildplasser Jul 19 '12 at 19:56
0
 while (select max(ColumnIndex+RowIndex) from matrix_data)<10
 begin
      insert matrix_data
      select c1.RowIndex, c1.ColumnIndex+1, (c1.MatrixValue+c2.MatrixValue)/2 
      from matrix_data c1
           inner join
           matrix_data c2 
           on c1.ColumnIndex+1=c2.ColumnIndex and c1.RowIndex-1 = c2.RowIndex   
      where c1.RowIndex+c1.ColumnIndex=(select max(RowIndex+ColumnIndex) from matrix_data) 
      and c1.ColumnIndex<5
 end
podiluska
  • 50,950
  • 7
  • 98
  • 104