I want to copy rows from one table, t2, to another, t1, while excluding rows whose values already exist in t1. The usual NOT IN approach works fine, but only as long as the same value does not occur multiple times in the source table t2.
Assume I have two tables with this schema:
CREATE TABLE t1 ( id INTEGER );
CREATE TABLE t2 ( id INTEGER );
and insert data into them like this:
INSERT INTO t1 VALUES (1);
INSERT INTO t2 VALUES (1);
INSERT INTO t2 VALUES (2);
Now I try to insert all rows from t2 into t1, excluding values that already exist in t1:
INSERT INTO t1 (id) SELECT t2.id FROM t2
WHERE t2.id NOT IN ( SELECT t1.id FROM t1 WHERE t1.id = t2.id );
This works flawlessly; the row in t2 with the value 1 was not inserted a second time into t1:
SELECT * FROM t1;
id
----
1
2
(2 rows)
But when the same value occurs multiple times in t2, the check against t1 does not seem to happen for each individual inserted row; it appears to be evaluated once for the whole INSERT statement. Let's continue with my example:
DELETE FROM t1;
INSERT INTO t2 VALUES (2);
SELECT * FROM t2;
id
----
1
2
2
(3 rows)
INSERT INTO t1 (id) SELECT t2.id FROM t2
WHERE t2.id NOT IN ( SELECT t1.id FROM t1 WHERE t1.id = t2.id );
SELECT * FROM t1;
id
----
1
2
2
(3 rows)
WHERE NOT EXISTS gives the same result.
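For reference, the NOT EXISTS variant I mean looks roughly like this. It is only a sketch of the equivalent form, run against the same t1 and t2 as above:

```sql
-- Same intent as the NOT IN version: skip ids that are already in t1.
-- Against an empty t1 and t2 = (1, 2, 2), this still inserts all three rows.
INSERT INTO t1 (id)
SELECT t2.id
FROM t2
WHERE NOT EXISTS ( SELECT 1 FROM t1 WHERE t1.id = t2.id );
```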
Does anyone have an idea how to check for existing values in t1 at the individual row level, to prevent multiple occurrences?
I could also use ON CONFLICT DO ..., but I would rather not, since the idea is to split the data coming from t2 into a "clean" t1 and a "dirty" t1_faulty, which collects all rows that do not meet some given criteria (one of which is the uniqueness of id, which is what this question is about).
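To make the uniqueness criterion concrete, here is a small sketch that lists the ids occurring more than once in t2. The t1_faulty schema is an assumption (I simply mirror t1), and the query only identifies the duplicated values; it does not yet do the split itself:

```sql
-- Assumed schema for the "dirty" table, mirroring t1.
CREATE TABLE t1_faulty ( id INTEGER );

-- Ids that appear more than once in t2.
-- With t2 = (1, 2, 2) this returns one row: id = 2, occurrences = 2.
SELECT id, count(*) AS occurrences
FROM t2
GROUP BY id
HAVING count(*) > 1;
```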