0

I have project where I have to insert a bunch of records periodically (~5k record in 10 minutes periods) from a remote source. I have a staging table which has the same structure as the final target table (no primary key, id is varchar and not unique), to ensure there won't be any duplicates, I'm using the next sql command:

insert into g_his 
select * 
from tmp_his h 
where not exists (select id, times 
                  from g_his g 
                  where g.id = h.id 
                    and g.times = h.times);

After this finished the tmp_his table is truncated.

Is it possible this gets an implicit lock on the g_his table and prevents any other inserts? Or can it be a heavy load? Or any better idea to do this?

DB is Postgres 9.6

posthy
  • 47
  • 6
  • 1
    Why there is no constraint on id in the staging table? If there is a constraint in staging table, then you may not need that not exists in your insert statement. this will increase performance – Valli Oct 03 '17 at 17:49
  • No primary key? Not unique? If you want anarchy you're going to get it. – tadman Oct 03 '17 at 18:37
  • Sometimes for a staging table it's easy to get the data in there then worry about cleaning it up... I often to staging without constraints. – jleach Oct 03 '17 at 18:52
  • Your code implies that the primary key of the `g_his` table is `id` and `times` -- is this correct? You might get better performance by indexing these two columns in your staging table and using a left join with an `is null` condition on one of the joined columns in the staging table. Depending on the frequency of the imports, you may want to manually vacuum that table after each use. – rd_nielsen Oct 03 '17 at 18:56
  • An insert will never result in a _table_ lock. –  Oct 04 '17 at 07:44

1 Answers1

0

Create a unique constraint on g_his(id, times) and use

INSERT INTO g_his 
   SELECT * 
   FROM tmp_his h
ON CONFLICT (id, times) DO NOTHING;
Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • As it's always this is a legacy system, we already have an index on these two columns, just not unique. I will look into if it could be and try to go this way. Thanks. – posthy Oct 04 '17 at 08:39