I would like to perform a window function with a filter clause, for example:
LAG("date", 1) FILTER (WHERE type='A') OVER (PARTITION BY id ORDER BY id ASC) AS "A_lag_1"
However, Postgres doesn't support this operation but I cant determine how else do it. Details below
Challange
Input tab_A
:
+----+------+------+
| id | type | date |
+----+------+------+
| 1 | A | 30 |
| 1 | A | 25 |
| 1 | A | 20 |
| 1 | B | 29 |
| 1 | B | 28 |
| 1 | B | 21 |
| 1 | C | 24 |
| 1 | C | 22 |
+----+------+------+
Desired output:
+----+------+------+---------+---------+---------+---------+---------+---------+
| id | type | date | A_lag_1 | A_lag_2 | B_lag_1 | B_lag_2 | C_lag_1 | C_lag_2 |
+----+------+------+---------+---------+---------+---------+---------+---------+
| 1 | A | 30 | 25 | 20 | 29 | 28 | 24 | 22 |
| 1 | A | 25 | 20 | | | | 24 | 22 |
| 1 | A | 20 | | | | | | |
| 1 | B | 29 | 25 | 20 | 28 | 21 | 24 | 22 |
| 1 | B | 28 | 25 | 20 | 21 | | 24 | 22 |
| 1 | B | 21 | 20 | | | | 24 | 22 |
| 1 | C | 24 | 20 | | 21 | | 22 | |
| 1 | C | 22 | 20 | | 21 | | | |
+----+------+------+---------+---------+---------+---------+---------+---------+
In words:
- For each row select all rows which occurred before it (see
date
column) - Then for each
type
('A', 'B', 'C') put the most recentdate
inA_lag_1
and the second the most recent (by date) value inA_lag_2
fortype
'A', andB_lag_1
,B_lag_2
for 'B' etc.
The above example is quite simplified, in my real use case there will be a lot more id
values, more lag column iterations A_lag_X
and types.
Posible solution
This challenge seems a perfect fit for a window function as I want to keep the same number of rows tab_A
and append information which is related to the row but in the past.
So constructing it using a window function (sqlfiddle):
SELECT
id, type, "date",
LAG("date", 1) FILTER (WHERE type='A') OVER (PARTITION BY id ORDER BY id ASC, "date" DESC) AS "A_lag_1",
LAG("date", 2) FILTER (WHERE type='A') OVER (PARTITION BY id ORDER BY id ASC, "date" DESC) AS "A_lag_2",
LAG("date", 1) FILTER (WHERE type='B') OVER (PARTITION BY id ORDER BY id ASC, "date" DESC) AS "B_lag_1",
LAG("date", 2) FILTER (WHERE type='B') OVER (PARTITION BY id ORDER BY id ASC, "date" DESC) AS "B_lag_2",
LAG("date", 1) FILTER (WHERE type='C') OVER (PARTITION BY id ORDER BY id ASC, "date" DESC) AS "C_lag_1",
LAG("date", 2) FILTER (WHERE type='C') OVER (PARTITION BY id ORDER BY id ASC, "date" DESC) AS "C_lag_2"
FROM tab_A
However, I get the following error:
ERROR: FILTER is not implemented for non-aggregate window functions Position: 30
Although this error is referenced in the documentation I cant determine another way of doing it.
Any help would be much appreciated.
Other SO questions:
- 1. This answer relies on using an aggregate function such as
max
. However, this won't work when trying to retrieve the 2nd last row, 3rd last row etc.