0

Imagine a sample table

CREATE TABLE mytable (myid integer, myval double precision);

I want to update myval to the previous non-zero value, ordered by myid, if myval is equal to 0.

For that to work properly, the query has to start updating from the lowest myid and end with the highest one.

I really don't know from where to start this time. The following says that window functions are not allowed in UPDATE:

UPDATE mytable
SET myval = LAG(myval) OVER (ORDER BY myid)
WHERE myval = 0
RETURNING *;

And more complex alternatives with FROM subqueries have ended in syntax errors or dumb outputs because the subquery is evaluated once instead of once per row. This last phrase makes me think about the SELECT ... LEFT JOIN LATERAL ... structure, but I haven't been able to make it work with the update statement.

Héctor
  • 399
  • 3
  • 16

2 Answers2

2

First part of the solution : as you can't call a window function inside the SET clause of an UPDATE, you can use a cte instead :

WITH list AS
(
  SELECT myval, LAG(myval) OVER (ORDER BY myid) AS new_val
    FROM mytable
   WHERE myval = 0
)
UPDATE mytable AS t
SET myval = l.new_val
FROM list AS l
WHERE t.myval = l.myval ;

Second part of the solution : for replacing the zero values with their non-zero previous value in the specified order, we can't add the clause FILTER (WHERE myval <> 0) to the lag() function because this clause is specific to the aggregate functions only (all the aggregate functions can be used as a window function by adding the OVER() clause behind, but the "pure" window functions are not aggregate functions). So here we can define our own aggregate function replaced_by_first_previous_non_zero_value() as follow :

CREATE OR REPLACE FUNCTION replaced_by_first_previous_non_zero_value(x double precision, y double precision)
RETURNS double precision LANGUAGE sql AS
$$
SELECT CASE 
         WHEN y = 0
         THEN COALESCE(x, y)
         ELSE y
       END ;
$$ ;

DROP AGGREGATE IF EXISTS replaced_by_first_previous_non_zero_value(double precision) ;
CREATE AGGREGATE replaced_by_first_previous_non_zero_value(double precision)
( sfunc = replaced_by_first_previous_non_zero_value
, stype = double precision
) ;

Then, for the following query :

SELECT myval, replaced_by_first_previous_non_zero_value(myval) OVER (ORDER BY myid RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
  FROM unnest(array[5, 0, 0 , 0 , 6] :: double precision[]) with ordinality as a(myval, myid)

we get he following result :

myval | replaced_by_first_previous_non_zero_value
5     | 5
0     | 5
0     | 5
0     | 5
6     | 6
Edouard
  • 6,577
  • 1
  • 9
  • 20
  • I'm afraid that is not that easy. With your example, a fragment of `myval` col ordered ASC by miyid like this one `[5, 0, 0 , 0 , 6]` would look `[5, 5, 0 , 0 , 6]` instead of `[5, 5, 5, 5, 6]`. Maybe isn't explained clear enough in the question. – Héctor Nov 03 '21 at 09:55
  • @hector, your question was clear enough but I missed that important point. I've updated the answer accordingly adding the `FILTER (WHERE myval <>0)` clause. Tell me if it better fit your expectation. – Edouard Nov 03 '21 at 10:22
  • I've tried to implement it but: `error: FILTER is not implemented for non-aggregate window functions` – Héctor Nov 03 '21 at 10:53
  • Yes indeed I didn't test the code as proposed. As the lag() function is a window function but not an aggregate function, I propose to create our own aggregate function, see the updated answer which is tested for this time. – Edouard Nov 03 '21 at 11:53
  • Could be @arutar answer simpler? What do you think? – Héctor Nov 03 '21 at 12:50
1
update mytable t set myval=(select s.myval from mytable s where s.myid < t.myid and s.myval!=0 order by s.myid desc limit 1) where t.myid in (select myid from mytable where myval=0 order by myid for update) ;

results

select * from mytable;
 myid | myval
------+-------
    1 |     1
    2 | 0.123
    3 |     0
    4 |     5
    7 |     0

update mytable t set myval=(select s.myval from mytable s where s.myid < t.myid and s.myval!=0 order by s.myid desc limit 1) where t.myid in (select myid from mytable where myval=0 order by myid for update) ;

select * from mytable order by myid;
 myid | myval
------+-------
    1 |     1
    2 | 0.123
    3 | 0.123
    4 |     5
    7 |     5

arutar
  • 1,015
  • 3
  • 9
  • It works! Do you think it could be generalized to n `myval`s? The last `WHERE` clause makes it a little bit difficult for me to think about a generalized solution for n columns. – Héctor Nov 03 '21 at 12:53
  • it would be work for n values `select myid from mytable where myval in (0,1,2,3,n) order by myid for update` – arutar Nov 03 '21 at 13:02
  • No, sorry. I mean for `my_col_a`, `my_col_b`, etc... – Héctor Nov 03 '21 at 13:08
  • Seems that something like this does it: `UPDATE mytable AS u SET my_var_a = ( SELECT s.my_var_a FROM mytable AS s WHERE s.myid <= u.myid AND s.my_var_a != 0 ORDER BY s.myid DESC LIMIT 1 ), my_var_b = ( SELECT s.my_var_b FROM mytable AS s WHERE s.myid <= u.myid AND s.my_var_b != 0 ORDER BY s.myid DESC LIMIT 1 ) WHERE u.myid IN ( SELECT "myid" FROM mytable WHERE my_var_a = 0 OR my_var_b = 0 ORDER BY "timestamp" ASC FOR UPDATE )`; (Sorry, comments don't allow code format) – Héctor Nov 03 '21 at 13:47