i need help with a problem that drives me nuts.
Imagine you have the following table with just three ids.
CREATE TABLE candidates
(
row_num BIGINT,
in_id BIGINT,
out_id BIGINT
);
Fill it with some data:
INSERT INTO candidates (row_num, in_id, out_id)
VALUES (1, 220, 210),
(2, 208, 210),
(3, 216, 210),
(4, 220, 214),
(5, 208, 214),
(6, 216, 214),
(7, 212, 218);
What i now want is to select all rows in a way that each in_id
and out_id
is unique across all selected rows.
So we'd select the first row, because we didn't see either the in_id
220 nor the out_id
210 before. The pair (220,210)
is selected and therefore no following row is allowed to have the in_id
220 or the out_id
210. That's why we need to skip the next 3 rows (they either have in_id
220 or out_id
210). Now we end up at row_num
5. Neither 208 nor 214 where selected before. Thats why it also should be part of the final result set. Same goes for row_num
7.
The final result set should look like:
row_num | in_id | out_id
1 220 210
5 208 214
7 212 218
Seems so simple but i can't solve it. Cleanest try is the following query which has the problem that despite that it knows which ids appeared before it can't detect if a row was selected or not. Breaks for the case of row_num
5.
SELECT *
FROM candidates tc1
WHERE NOT EXISTS (
SELECT
FROM candidates tc2
WHERE (tc2.in_id = tc1.in_id OR tc2.out_id = tc1.out_id)
AND tc2.row_num < tc1.row_num
);
Any help greatly appreciated, thanks!