Importing a large number of rows into Postgres with duplicate key violations

Question

Is it possible in Postgres 9.6 to get on duplicate key UPSERT functionality using the COPY command? I have a CSV file that I'm importing into Postgres, but it may contain some duplicate key violations, so the COPY command gives an error and terminates when it encounters them.

The file is very large so it might not be possible to pre-process it in application code (in order to handle rows that might lead to duplicate key violations) since all keys might not fit into memory.

What's the best way to import a very large number of rows into Postgres that might contain duplicate key violations?

copy has not upsert. you can copy to other table and then insert differece — Vao Tsun, May 01 '17 at 19:43
@VaoTsun Could you please post an answer with an example of what you mean? I'm not sure how to implement the 'insert difference' part. — user779159, May 01 '17 at 19:54
I did. in your case you have UK, so you have to insert from `left outer join...where original.uk_att is null`. if you want to insert rows that are different at all( all columns), use `except` in similar way — Vao Tsun, May 01 '17 at 20:06

score 3 · Answer 1 · answered May 01 '17 at 20:04

sample:

t=# create table s90(i int primary key, t text);
CREATE TABLE
t=# insert into s90 select 1,'a';
INSERT 0 1
t=# copy s90 from stdin delimiter ',';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1,'b'
>> 2,'c'
>> \.
ERROR:  duplicate key value violates unique constraint "s90_pkey"
DETAIL:  Key (i)=(1) already exists.
CONTEXT:  COPY s90, line 1

workaround with copy:

t=# create table s91 as select * from s90 where false;;
SELECT 0
t=# copy s91 from stdin delimiter ',';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1,'b'
>> 2,'c'
>> \.
COPY 2
t=# with p as (select s91.* from s91 left outer join s90 on s90.i=s91.i where s90.i is null)
insert into s90 select * from p;
INSERT 0 1
t=# select * from s90;
 i |  t
---+-----
 1 | a
 2 | 'c'
(2 rows)

Importing a large number of rows into Postgres with duplicate key violations

1 Answers1