In a rather simple table with a composite primary key (see DDL below) there are about 40k records.

create table la_ezg
(
    be_id                integer not null,
    usage                text    not null,
    area                 numeric(18, 6),
    sk_area              numeric(18, 6),
    count_ezg            numeric(18, 6),
    ...
    ...
    constraint la_ezg_pkey
        primary key (be_id, usage)
);

There is also a simple procedure whose purpose is to delete the rows with a certain be_id and then persist the rows from another view where they are "generated":


CREATE OR REPLACE function pr_create_la_ezg(pBE_ID numeric) returns void as
$$
begin
    delete from la_ezg where be_id = pBE_ID;

    insert into la_ezg (BE_ID, USAGE, ...)
    select be_id, usage, ...
    from vw_la_ezg_with_usage
    where be_id = pBE_ID;
END;
$$ language plpgsql;

The procedure needs about 7 minutes to execute...

Both statements (DELETE and INSERT) execute in less than 100 ms on their own for the very same be_id.
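
For reference, the standalone timings can be reproduced with something like the following sketch; the DELETE is wrapped in a transaction and rolled back so the data stays untouched:

begin;
explain (analyze, buffers) delete from la_ezg where be_id = 790696;
rollback;

explain (analyze, buffers)
select * from vw_la_ezg_with_usage where be_id = 790696;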

There are a lot of different locks showing up in pg_locks during those 7 minutes, but I wasn't able to figure out what exactly is going on inside this transaction and whether there is some kind of deadlocking. In the end the procedure returns successfully, it just needs way too much time doing it.
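
Something along these lines can be used to look for blockers while the call is running (just a sketch; the PID in the second query is a placeholder for the backend executing the function):

select a.pid,
       a.state,
       a.wait_event_type,
       a.wait_event,
       pg_blocking_pids(a.pid) as blocked_by,
       a.query
from pg_stat_activity a
where a.datname = current_database()
  and a.pid <> pg_backend_pid();

select l.locktype, l.relation::regclass, l.mode, l.granted
from pg_locks l
where l.pid = 12345;  -- placeholder: pid of the session running the function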

EDIT (activated 'auto_explain' and ran all three queries again):

 duration: 1.420 ms  plan:
 Query Text: delete from la_ezg where be_id=790696
 Delete on la_ezg  (cost=4.33..22.89 rows=5 width=6)
 ->  Bitmap Heap Scan on la_ezg  (cost=4.33..22.89 rows=5 width=6)
 Output: ctid
 Recheck Cond: (la_ezg.be_id = 790696)
 ->  Bitmap Index Scan on sys_c0073325  (cost=0.00..4.33 rows=5 width=0)
 Index Cond: (la_ezg.be_id = 790696)
 1 row affected in 107 ms

 duration: 71.645 ms  plan:
 Query Text: insert into la_ezg(BE_ID,USAGE,...)
 select be_id,USAGE,... from vw_la_ezg_with_usage where be_id=790696
 Insert on la_ezg  (cost=1343.71..2678.87 rows=1 width=228)
 ->  Nested Loop  (cost=1343.71..2678.87 rows=1 width=228)
 Output: la_ezg_geo.be_id, usage.nutzungsart, COALESCE(round(((COALESCE(st_area(la_ezg_geo.geometry), '3'::double precision) / '10000'::double precision))::numeric, 2), '0'::numeric), NULL::numeric, COALESCE((count(usage.nutzungsart)), '0'::bigint), COALESCE(round((((sum(st_area(st_intersection(ezg.geometry, usage.geom)))) / '10000'::double precision))::numeric, 2), '0'::numeric), COALESCE(round(((((sum(st_area(st_intersection(ezg.geometry, usage.geom)))) * '100'::double precision) / COALESCE(st_area(la_ezg_geo.geometry), '3'::double precision)))::numeric, 2), '0'::numeric), NULL::character varying, NULL::timestamp without time zone, NULL::character varying, NULL::timestamp without time zone
 ->  GroupAggregate  (cost=1343.71..1343.76 rows=1 width=41)
 Output: ezg.be_id, usage.nutzungsart, sum(st_area(st_intersection(ezg.geometry, usage.geom))), count(usage.nutzungsart)
 Group Key: ezg.be_id, usage.nutzungsart
 ->  Sort  (cost=1343.71..1343.71 rows=1 width=1834)
 Output: ezg.be_id, usage.nutzungsart, ezg.geometry, usage.geom
 Sort Key: usage.nutzungsart
 ->  Nested Loop  (cost=0.42..1343.70 rows=1 width=1834)
 Output: ezg.be_id, usage.nutzungsart, ezg.geometry, usage.geom
 ->  Seq Scan on la_ezg_geo ezg  (cost=0.00..1335.00 rows=1 width=1516)
 Output: ezg.objectid, ezg.be_id, ezg.name, ezg.se_anno_cad_data, ezg.benutzer_geaendert, ezg.datum_geaendert, ezg.status, ezg.benutzer_erstellt, ezg.datum_erstellt, ezg.len, ezg.geometry, ezg.temp_char, ezg.vulgo, ezg.flaeche, ezg.hauptgemeinde, ezg.prozessart, ezg.verbauungsgrad, ezg.verordnung_txt, ezg.gemeinden_txt, ezg.hinderungsgrund, ezg.kompetenz, ezg.seehoehe_min, ezg.seehoehe_max, ezg.neigung_min, ezg.neigung_max, ezg.exposition
 Filter: (ezg.be_id = 790696)
 ->  Index Scan using dkm_nutz_fl_geom_1551355663100174000 on dkm.dkm_nutz_fl nutzung  (cost=0.42..8.69 rows=1 width=318)
 Output: usage.gdo_gid, usage.gst, usage.nutzungsart, usage.nutzungsabschnitt, usage.statistik, usage.flaeche, usage.kennung, usage.von_datum, usage.bis_datum, usage.von_az, usage.bis_az, usage.projekt, usage.fme_basename, usage.fme_dataset, usage.fme_feature_type, usage.fme_type, usage.oracle_srid, usage.geom
 Index Cond: ((usage.geom && ezg.geometry) AND (usage.geom && ezg.geometry))
 Filter: _st_intersects(usage.geom, ezg.geometry)
 ->  Seq Scan on la_ezg_geo  (cost=0.00..1335.00 rows=1 width=1516)
 Output: la_ezg_geo.objectid, la_ezg_geo.be_id, la_ezg_geo.name, la_ezg_geo.se_anno_cad_data, la_ezg_geo.benutzer_geaendert, la_ezg_geo.datum_geaendert, la_ezg_geo.status, la_ezg_geo.benutzer_erstellt, la_ezg_geo.datum_erstellt, la_ezg_geo.len, la_ezg_geo.geometry, la_ezg_geo.temp_char, la_ezg_geo.vulgo, la_ezg_geo.flaeche, la_ezg_geo.hauptgemeinde, la_ezg_geo.prozessart, la_ezg_geo.verbauungsgrad, la_ezg_geo.verordnung_txt, la_ezg_geo.gemeinden_txt, la_ezg_geo.hinderungsgrund, la_ezg_geo.kompetenz, la_ezg_geo.seehoehe_min, la_ezg_geo.seehoehe_max, la_ezg_geo.neigung_min, la_ezg_geo.neigung_max, la_ezg_geo.exposition
 Filter: (la_ezg_geo.be_id = 790696)
 1 row affected in 149 ms

 duration: 421851.819 ms  plan:
 Query Text: select pr_create_la_ezg(790696)
 Result  (cost=0.00..0.26 rows=1 width=4)
 Output: pr_create_la_ezg('790696'::numeric)
 1 row retrieved starting from 1 in 7 m 1 s 955 ms (execution: 7 m 1 s 929 ms, fetching: 26 ms)
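
For reference, auto_explain can be enabled per session roughly like this; auto_explain.log_nested_statements is the setting that would additionally log the plans of the statements executed inside the function, which the bare function-call plan above does not show:

LOAD 'auto_explain';                          -- or add it to session_preload_libraries
SET auto_explain.log_min_duration = 0;        -- log the plan of every statement
SET auto_explain.log_analyze = on;            -- include actual timings
SET auto_explain.log_verbose = on;            -- include output columns
SET auto_explain.log_nested_statements = on;  -- also log statements run inside functions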

P.S. I shortened some of the queries and names for the sake of readability

P.P.S. This database is a legacy migration project. As in this case, there are often views depending on views over multiple layers. I'd like to streamline all this, but I am in desperate need of a way to debug what's going on inside such a transaction; otherwise I would have to rebuild nearly everything, with the risk of breaking things.

Jürgen Zornig
  • Just to make sure that I understand your question correctly: each statement takes 100ms on its own, but when you combine them in the function it suddenly takes 7 minutes? (instead of the expected ~200ms) –  Nov 23 '20 at 09:31
  • exactly, to keep long story short – Jürgen Zornig Nov 23 '20 at 09:37
  • Use `auto_explain` to get the execution plans of the statements in the function and look for differences. – Laurenz Albe Nov 23 '20 at 10:13
  • well, auto_explain shows me very verbose execution plans for the two individual queries, but when executing the procedure I just get "Result" (cost, rows) and output, which isn't very informative in this case – Jürgen Zornig Nov 23 '20 at 10:59
  • Does it make any difference if you make it a `language sql` function instead? During those 7 minutes can you check what the other sessions that are causing the locks are doing? You can see the PIDs that are blocking the delete or insert using `pg_blocking_pids()` –  Nov 23 '20 at 11:36
  • Looking at the explain output, it seems one of the statements in the function is waiting for 7 minutes before it's actually started as the actual execution time is as expected. Do you have concurrent inserts into the table with possibly the same primary key? –  Nov 23 '20 at 11:40
  • @a_horse_with_no_name the first two plans are from running DELETE and INSERT as single query executions. The third plan is from running the procedure... I'm not sure I understand "as the actual execution time is as expected"... I just get no plan for what's happening inside the procedure. I also tried it as a plain `language sql` function, where the behaviour is the same... 7 minutes. I searched for blocking pids as described [here](https://stackoverflow.com/questions/26489244/how-to-detect-query-which-holds-the-lock-in-postgres), but according to that query there are no blocking queries. – Jürgen Zornig Nov 23 '20 at 12:21
  • Ah, I thought the two statements were the output when running the function. What wait event do you see in `pg_stat_activity` for the session running the function during those 7 minutes? `state`, `wait_event_type` and `wait_event` would be the interesting columns. –  Nov 23 '20 at 12:22
  • It's NULL / NULL but `state` is active – Jürgen Zornig Nov 23 '20 at 12:26
  • The views involved here are rather complicated... they contain some spatial joins, spatial aggregates and so on. I suppose that when it is called from the procedure, the query optimizer no longer uses the indices in the correct order (table scans instead of the spatial index and so on). At the moment I am trying to simplify these views, hoping this results in a better execution plan somehow. Inside the transaction block the QO "knows" that all records with that specific ID were DELETEd; perhaps that's the reason it ends up with another plan – Jürgen Zornig Nov 23 '20 at 12:32
  • I was able to work around / solve this by replicating the whole view definition inside the procedure and applying the `where be_id = pBE_ID` filter some levels deeper down the dependency tree. That doesn't really answer why the view works blazingly fast when it's called outside a procedure, though. – Jürgen Zornig Nov 23 '20 at 14:52
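
To illustrate the workaround from the last comment: the body of vw_la_ezg_with_usage was copied into the function and the be_id filter applied further down instead of on the outermost view. The sketch below only shows the shape of that change; the column list is shortened to the key columns, the aggregates are omitted, and the join is reduced to the tables visible in the plan above:

-- before: filter on the outermost view
insert into la_ezg (be_id, usage)
select be_id, usage
from vw_la_ezg_with_usage
where be_id = pBE_ID;

-- after: view body inlined, filter pushed down towards the base tables
insert into la_ezg (be_id, usage)
select ezg.be_id, u.nutzungsart
from la_ezg_geo ezg
join dkm.dkm_nutz_fl u
  on st_intersects(u.geom, ezg.geometry)
where ezg.be_id = pBE_ID        -- applied some levels deeper down the dependency tree
group by ezg.be_id, u.nutzungsart;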

0 Answers