I'm sorry for a bit fuzzy title. Don't know how to describe it better.
I have table like this
CREATE TABLE foo (
id INT NOT NULL,
version INT NOT NULL,
data VARCHAR(200) NULL,
PRIMARY KEY (id, version)
);
With data like this:
INSERT INTO foo
VALUES
(1, 1, '1.1'),
(2, 1, '2.1'),
(3, 1, '3.1'),
(1, 2, '1.2'),
(2, 2, '2.2'),
(4, 1, '4.1'),
(1, 3, '1.3'),
(4, 2, '4.2');
I need to get all rows (or N rows with highest id
values. This actually would be more often) with latest available version
and unique id
. I have two queries now that seems to be working. But I'm not confident enough in SQL to decide which is better of if there is even better approach. Also I don't think that I need any additional index for this but again, suggestions are welcome. DB is PostgreSQL (9.5), I expect thousandth of rows (but it would not be 10000 in the near future probably).
Query 1:
test_db=# SELECT DISTINCT ON (id) id, version, data FROM foo ORDER BY id DESC, version DESC;
id | version | data
----+---------+------
4 | 2 | 4.2
3 | 1 | 3.1
2 | 2 | 2.2
1 | 3 | 1.3
(4 rows)
test_db=# SELECT DISTINCT ON (id) id, version, data FROM foo ORDER BY id DESC, version DESC limit 2;
id | version | data
----+---------+------
4 | 2 | 4.2
3 | 1 | 3.1
(2 rows)
Query 2:
test_db=# SELECT t.id, t.version, t.data FROM foo t
test_db-# JOIN (SELECT id, max(version) as version from foo GROUP BY id) q
test_db-# ON t.id = q.id AND t.version = q.version;
id | version | data
----+---------+------
3 | 1 | 3.1
2 | 2 | 2.2
1 | 3 | 1.3
4 | 2 | 4.2
(4 rows)
test_db=# SELECT t.id, t.version, t.data FROM foo t
test_db-# JOIN (SELECT id, max(version) as version from foo GROUP BY id) q
test_db-# ON t.id = q.id AND t.version = q.version
test_db-# ORDER BY id DESC LIMIT 2;
id | version | data
----+---------+------
4 | 2 | 4.2
3 | 1 | 3.1
(2 rows)