0

There are a number of SQL Statement I have to repeat on a regular basis. After my PosgreSQL Version was updated from 10 to 13 one particular querytakes 7 hours instead of 1:30min.

That's the query:

SELECT 
    polyssqlspatial.objid, pointssqlspatial.objid
from 
    (polyssqlspatial join ax11001 on polyssqlspatial.objid = ax11001_objid) 
join 
        (pointssqlspatial join AX14004 on  AX14004_objid = pointssqlspatial.objid) 
on 
        ST_Touches(pointssqlspatial.shape, polyssqlspatial.shape) or ST_Within(pointssqlspatial.shape, polyssqlspatial.shape);

polyssqlspatial and pointssqlspatial are the geometries ax11001 and ax14004 contain data connected to the geometries.

Basically, the statement is just returning all geometries if the points touch the polygons or are inside them.

I have 45000 polygons and 95327 Points.

Does anyone have ideas why this query takes 7h in PostgreSQL 13 and how to reduce the runtime again? It used to be 1:30min.

EDIT: Query Plan for PostgreSQL 10 Server:

 "QUERY PLAN"
"Gather  (cost=14495.04..4907988.66 rows=2146510 width=77) (actual time=66.999..83240.204 rows=373890 loops=1)"
"  Output: polyssqlspatial.objid, pointssqlspatial.objid"
"  Workers Planned: 2"
"  Workers Launched: 2"
"  Buffers: shared hit=4138168"
"  ->  Hash Join  (cost=13495.04..4692337.66 rows=894379 width=77) (actual time=81.475..83104.722 rows=124630 loops=3)"
"        Output: polyssqlspatial.objid, pointssqlspatial.objid"
"        Hash Cond: (polyssqlspatial.objid = (ax11001.ax11001_objid)::bpchar)"
"        Buffers: shared hit=4138168"
"        Worker 0: actual time=89.307..83099.719 rows=124105 loops=1"
"          Buffers: shared hit=1374659"
"        Worker 1: actual time=88.927..83129.914 rows=124533 loops=1"
"          Buffers: shared hit=1379935"
"        ->  Nested Loop  (cost=8534.07..4668814.00 rows=2565040 width=77) (actual time=52.892..82607.729 rows=364807 loops=3)"
"              Output: polyssqlspatial.objid, pointssqlspatial.objid"
"              Buffers: shared hit=3935399"
"              Worker 0: actual time=56.983..82605.787 rows=363104 loops=1"
"                Buffers: shared hit=1306956"
"              Worker 1: actual time=56.789..82638.076 rows=364934 loops=1"
"                Buffers: shared hit=1311962"
"              ->  Hash Join  (cost=8532.41..25562.37 rows=59711 width=79) (actual time=52.276..185.790 rows=47769 loops=3)"
"                    Output: pointssqlspatial.objid, pointssqlspatial.shape"
"                    Hash Cond: (pointssqlspatial.objid = (ax14004x11003.ax14004x11003_objid)::bpchar)"
"                    Buffers: shared hit=29811"
"                    Worker 0: actual time=56.310..178.166 rows=47624 loops=1"
"                      Buffers: shared hit=8487"
"                    Worker 1: actual time=56.298..184.014 rows=47693 loops=1"
"                      Buffers: shared hit=8846"
"                    ->  Parallel Seq Scan on public.pointssqlspatial  (cost=0.00..15738.53 rows=185153 width=79) (actual time=0.006..37.606 rows=148123 loops=3)"
"                          Output: pointssqlspatial.objid, pointssqlspatial.layerid, pointssqlspatial.povchanged, pointssqlspatial.georestr, pointssqlspatial.dokucount, pointssqlspatial.defid, pointssqlspatial.objpri, pointssqlspatial.objclass, pointssqlspatial.alpha, pointssqlspatial.isdelta, pointssqlspatial.deltar, pointssqlspatial.deltah, pointssqlspatial.sigwidth, pointssqlspatial.sigheight, pointssqlspatial.shape, pointssqlspatial.tesselcode, pointssqlspatial.tesselnumh, pointssqlspatial.tesselnumr"
"                          Buffers: shared hit=13887"
"                          Worker 0: actual time=0.007..35.803 rows=101712 loops=1"
"                            Buffers: shared hit=3179"
"                          Worker 1: actual time=0.007..38.871 rows=113216 loops=1"
"                            Buffers: shared hit=3538"
"                    ->  Hash  (cost=6741.07..6741.07 rows=143307 width=39) (actual time=51.947..51.947 rows=143307 loops=3)"
"                          Output: ax14004x11003.ax14004x11003_objid"
"                          Buckets: 262144  Batches: 1  Memory Usage: 11985kB"
"                          Buffers: shared hit=15924"
"                          Worker 0: actual time=55.980..55.980 rows=143307 loops=1"
"                            Buffers: shared hit=5308"
"                          Worker 1: actual time=55.985..55.985 rows=143307 loops=1"
"                            Buffers: shared hit=5308"
"                          ->  Seq Scan on public.ax14004x11003  (cost=0.00..6741.07 rows=143307 width=39) (actual time=0.106..26.172 rows=143307 loops=3)"
"                                Output: ax14004x11003.ax14004x11003_objid"
"                                Buffers: shared hit=15924"
"                                Worker 0: actual time=0.143..30.242 rows=143307 loops=1"
"                                  Buffers: shared hit=5308"
"                                Worker 1: actual time=0.165..29.961 rows=143307 loops=1"
"                                  Buffers: shared hit=5308"
"              ->  Bitmap Heap Scan on public.polyssqlspatial  (cost=1.66..77.29 rows=47 width=386) (actual time=0.367..1.716 rows=8 loops=143306)"
"                    Output: polyssqlspatial.objid, polyssqlspatial.layerid, polyssqlspatial.povchanged, polyssqlspatial.georestr, polyssqlspatial.dokucount, polyssqlspatial.defid, polyssqlspatial.objpri, polyssqlspatial.objclass, polyssqlspatial.rsp, polyssqlspatial.hsp, polyssqlspatial.alphasp, polyssqlspatial.sigbegin, polyssqlspatial.sigend, polyssqlspatial.schalter, polyssqlspatial.alphafs, polyssqlspatial.sizemm, polyssqlspatial.shape, polyssqlspatial.tesselcode, polyssqlspatial.tesselnumh, polyssqlspatial.tesselnumr"
"                    Recheck Cond: ((pointssqlspatial.shape && polyssqlspatial.shape) OR (polyssqlspatial.shape ~ pointssqlspatial.shape))"
"                    Filter: (((pointssqlspatial.shape && polyssqlspatial.shape) AND _st_touches(pointssqlspatial.shape, polyssqlspatial.shape)) OR ((polyssqlspatial.shape ~ pointssqlspatial.shape) AND _st_contains(polyssqlspatial.shape, pointssqlspatial.shape)))"
"                    Rows Removed by Filter: 8"
"                    Heap Blocks: exact=664156"
"                    Buffers: shared hit=3905588"
"                    Worker 0: actual time=0.368..1.721 rows=8 loops=47624"
"                      Buffers: shared hit=1298469"
"                    Worker 1: actual time=0.367..1.719 rows=8 loops=47693"
"                      Buffers: shared hit=1303116"
"                    ->  BitmapOr  (cost=1.66..1.66 rows=142 width=0) (actual time=0.159..0.159 rows=0 loops=143306)"
"                          Buffers: shared hit=1916131"
"                          Worker 0: actual time=0.160..0.160 rows=0 loops=47624"
"                            Buffers: shared hit=636834"
"                          Worker 1: actual time=0.159..0.159 rows=0 loops=47693"
"                            Buffers: shared hit=639450"
"                          ->  Bitmap Index Scan on spatial_idx_polyssqlspatial  (cost=0.00..0.38 rows=13 width=0) (actual time=0.083..0.083 rows=15 loops=143306)"
"                                Index Cond: (pointssqlspatial.shape && polyssqlspatial.shape)"
"                                Buffers: shared hit=959006"
"                                Worker 0: actual time=0.083..0.083 rows=15 loops=47624"
"                                  Buffers: shared hit=318734"
"                                Worker 1: actual time=0.083..0.083 rows=15 loops=47693"
"                                  Buffers: shared hit=320027"
"                          ->  Bitmap Index Scan on spatial_idx_polyssqlspatial  (cost=0.00..1.25 rows=129 width=0) (actual time=0.075..0.075 rows=15 loops=143306)"
"                                Index Cond: (polyssqlspatial.shape ~ pointssqlspatial.shape)"
"                                Buffers: shared hit=957125"
"                                Worker 0: actual time=0.075..0.075 rows=15 loops=47624"
"                                  Buffers: shared hit=318100"
"                                Worker 1: actual time=0.075..0.075 rows=15 loops=47693"
"                                  Buffers: shared hit=319423"
"        ->  Hash  (cost=4399.32..4399.32 rows=44932 width=39) (actual time=28.007..28.007 rows=44932 loops=3)"
"              Output: ax11001.ax11001_objid"
"              Buckets: 65536  Batches: 1  Memory Usage: 3628kB"
"              Buffers: shared hit=11850"
"              Worker 0: actual time=31.586..31.586 rows=44932 loops=1"
"                Buffers: shared hit=3950"
"              Worker 1: actual time=31.563..31.563 rows=44932 loops=1"
"                Buffers: shared hit=3950"
"              ->  Seq Scan on public.ax11001  (cost=0.00..4399.32 rows=44932 width=39) (actual time=0.080..20.297 rows=44932 loops=3)"
"                    Output: ax11001.ax11001_objid"
"                    Buffers: shared hit=11850"
"                    Worker 0: actual time=0.145..23.801 rows=44932 loops=1"
"                      Buffers: shared hit=3950"
"                    Worker 1: actual time=0.081..23.693 rows=44932 loops=1"
"                      Buffers: shared hit=3950"
"Planning time: 24.645 ms"
"Execution time: 83260.579 ms"

Query Plan for PostgreSQL 13 Server

    "Nested Loop  (cost=7653.57..426137274.01 rows=938 width=77) (actual time=43124.328..40067741.585 rows=373890 loops=1)"
"  Output: polyssqlspatial.objid, pointssqlspatial.objid"
"  Join Filter: (polyssqlspatial.objid = (ax11001.ax11001_objid)::bpchar)"
"  Rows Removed by Join Filter: 49174195414"
"  Buffers: shared read=323, temp read=521648910 written=11610"
"  ->  Index Only Scan using ixax11001_objid on public.ax11001  (cost=0.41..1489.39 rows=44932 width=39) (actual time=14.773..203.962 rows=44932 loops=1)"
"        Output: ax11001.ax11001_objid"
"        Heap Fetches: 0"
"        Buffers: shared hit=1 read=323"
"  ->  Materialize  (cost=7653.15..424321437.18 rows=2692 width=77) (actual time=0.027..368.951 rows=1094422 loops=44932)"
"        Output: polyssqlspatial.objid, pointssqlspatial.objid"
"        Buffers: shared hit=4561020, temp read=521648910 written=11610"
"        ->  Gather  (cost=7653.15..424321423.72 rows=2692 width=77) (actual time=116.830..40941.662 rows=1094422 loops=1)"
"              Output: polyssqlspatial.objid, pointssqlspatial.objid"
"              Workers Planned: 2"
"              Workers Launched: 2"
"              Buffers: shared hit=4561020"
"              ->  Nested Loop  (cost=6653.15..424320154.52 rows=1122 width=77) (actual time=55.248..41528.751 rows=364807 loops=3)"
"                    Output: polyssqlspatial.objid, pointssqlspatial.objid"
"                    Buffers: shared hit=4561020"
"                    Worker 0:  actual time=22.459..42090.385 rows=374235 loops=1"
"                      Buffers: shared hit=1558962"
"                    Worker 1:  actual time=28.819..41980.088 rows=365154 loops=1"
"                      Buffers: shared hit=1515646"
"                    ->  Parallel Hash Join  (cost=6651.50..23781.16 rows=59711 width=79) (actual time=54.054..317.307 rows=47769 loops=3)"
"                          Output: pointssqlspatial.objid, pointssqlspatial.shape"
"                          Hash Cond: (pointssqlspatial.objid = (ax14004x11003.ax14004x11003_objid)::bpchar)"
"                          Buffers: shared hit=19716"
"                          Worker 0:  actual time=20.690..315.936 rows=48973 loops=1"
"                            Buffers: shared hit=5797"
"                          Worker 1:  actual time=27.753..275.461 rows=47845 loops=1"
"                            Buffers: shared hit=4934"
"                          ->  Parallel Seq Scan on public.pointssqlspatial  (cost=0.00..16186.53 rows=185153 width=79) (actual time=0.024..55.810 rows=148123 loops=3)"
"                                Output: pointssqlspatial.objid, pointssqlspatial.layerid, pointssqlspatial.tesselcode, pointssqlspatial.tesselnumh, pointssqlspatial.tesselnumr, pointssqlspatial.povchanged, pointssqlspatial.georestr, pointssqlspatial.dokucount, pointssqlspatial.defid, pointssqlspatial.objpri, pointssqlspatial.objclass, pointssqlspatial.alpha, pointssqlspatial.isdelta, pointssqlspatial.deltar, pointssqlspatial.deltah, pointssqlspatial.sigwidth, pointssqlspatial.sigheight, pointssqlspatial.shape"
"                                Buffers: shared hit=14335"
"                                Worker 0:  actual time=0.024..64.994 rows=160983 loops=1"
"                                  Buffers: shared hit=5193"
"                                Worker 1:  actual time=0.026..49.194 rows=128340 loops=1"
"                                  Buffers: shared hit=4140"
"                          ->  Parallel Hash  (cost=5905.11..5905.11 rows=59711 width=39) (actual time=52.555..52.557 rows=47769 loops=3)"
"                                Output: ax14004x11003.ax14004x11003_objid"
"                                Buckets: 262144  Batches: 1  Memory Usage: 12192kB"
"                                Buffers: shared hit=5308"
"                                Worker 0:  actual time=20.150..20.151 rows=15354 loops=1"
"                                  Buffers: shared hit=569"
"                                Worker 1:  actual time=27.197..27.199 rows=20493 loops=1"
"                                  Buffers: shared hit=759"
"                                ->  Parallel Seq Scan on public.ax14004x11003  (cost=0.00..5905.11 rows=59711 width=39) (actual time=0.026..31.099 rows=47769 loops=3)"
"                                      Output: ax14004x11003.ax14004x11003_objid"
"                                      Buffers: shared hit=5308"
"                                      Worker 0:  actual time=0.028..11.930 rows=15354 loops=1"
"                                        Buffers: shared hit=569"
"                                      Worker 1:  actual time=0.027..16.306 rows=20493 loops=1"
"                                        Buffers: shared hit=759"
"                    ->  Bitmap Heap Scan on public.polyssqlspatial  (cost=1.65..7105.57 rows=26 width=385) (actual time=0.409..0.846 rows=8 loops=143306)"
"                          Output: polyssqlspatial.objid, polyssqlspatial.layerid, polyssqlspatial.tesselcode, polyssqlspatial.tesselnumh, polyssqlspatial.tesselnumr, polyssqlspatial.povchanged, polyssqlspatial.georestr, polyssqlspatial.dokucount, polyssqlspatial.defid, polyssqlspatial.objpri, polyssqlspatial.objclass, polyssqlspatial.rsp, polyssqlspatial.hsp, polyssqlspatial.alphasp, polyssqlspatial.sigbegin, polyssqlspatial.sigend, polyssqlspatial.schalter, polyssqlspatial.alphafs, polyssqlspatial.sizemm, polyssqlspatial.shape"
"                          Recheck Cond: (st_touches(pointssqlspatial.shape, polyssqlspatial.shape) OR st_within(pointssqlspatial.shape, polyssqlspatial.shape))"
"                          Filter: (st_touches(pointssqlspatial.shape, polyssqlspatial.shape) OR st_within(pointssqlspatial.shape, polyssqlspatial.shape))"
"                          Rows Removed by Filter: 8"
"                          Heap Blocks: exact=616261"
"                          Buffers: shared hit=4541304"
"                          Worker 0:  actual time=0.402..0.835 rows=8 loops=48973"
"                            Buffers: shared hit=1553165"
"                          Worker 1:  actual time=0.422..0.855 rows=8 loops=47845"
"                            Buffers: shared hit=1510712"
"                          ->  BitmapOr  (cost=1.65..1.65 rows=142 width=0) (actual time=0.301..0.301 rows=0 loops=143306)"
"                                Buffers: shared hit=2640396"
"                                Worker 0:  actual time=0.292..0.292 rows=0 loops=48973"
"                                  Buffers: shared hit=904433"
"                                Worker 1:  actual time=0.315..0.315 rows=0 loops=47845"
"                                  Buffers: shared hit=874797"
"                                ->  Bitmap Index Scan on spatial_idx_polyssqlspatial  (cost=0.00..0.38 rows=13 width=0) (actual time=0.158..0.158 rows=15 loops=143306)"
"                                      Index Cond: (polyssqlspatial.shape && pointssqlspatial.shape)"
"                                      Buffers: shared hit=1321443"
"                                      Worker 0:  actual time=0.150..0.150 rows=15 loops=48973"
"                                        Buffers: shared hit=452642"
"                                      Worker 1:  actual time=0.171..0.171 rows=15 loops=47845"
"                                        Buffers: shared hit=437805"
"                                ->  Bitmap Index Scan on spatial_idx_polyssqlspatial  (cost=0.00..1.25 rows=129 width=0) (actual time=0.142..0.142 rows=15 loops=143306)"
"                                      Index Cond: (polyssqlspatial.shape ~ pointssqlspatial.shape)"
"                                      Buffers: shared hit=1318953"
"                                      Worker 0:  actual time=0.140..0.140 rows=15 loops=48973"
"                                        Buffers: shared hit=451791"
"                                      Worker 1:  actual time=0.142..0.142 rows=15 loops=47845"
"                                        Buffers: shared hit=436992"
"Planning:"
"  Buffers: shared hit=477 dirtied=1"
"Planning Time: 77.195 ms"
"Execution Time: 40068039.569 ms"
bluesky
  • 1
  • 2
  • 1
    did you `analyze` all tables to update statistics after the upgrade? –  Dec 07 '22 at 10:39
  • Btw: all those parentheses for the joins are unnecessary (and I find them confusing to be honest) –  Dec 07 '22 at 10:40
  • The joins on `ax11001` and `ax14004` might be replaceable with EXISTS condition unless you do need the potential duplicates from `pointssqlspatial` and `polyssqlspatial` that these joins can generate. –  Dec 07 '22 at 10:43
  • `st_touch(point,polygon) or st_within(point,polygon)` can be reduced to a single `st_intersects(point,polygon)`. Being right on the boundary or entirely inside the polygon are the only ways a point can intersect a polygon. – Zegarek Dec 07 '22 at 10:55
  • 1
    Could you please share the results from EXPLAIN(ANALYZE, VERBOSE, BUFFERS, COSTS) for this statement on both servers? Looks like the query plan is wrong, but without a plan, it's impossible to see what is wrong. – Frank Heikens Dec 07 '22 at 11:00
  • I executed analyze, vacuum and reindex on all tables. Execution time took 1-10 seconds for each command and table. After, I edited the query as Zegarek suggested, the query is working for 20min now. I guess I can wait until it's over an check if it was faster than 7h. But even 30min is quite a lot. It used to be 1:30min... Is there something like a detailed log file? I wonder if each intersection check takes equally long or if there is corrupted geometry or sth. like that – bluesky Dec 07 '22 at 11:29
  • 1
    Put the `EXPLAIN (ANALYZE, VERBOSE, BUFFERS, COSTS) ` suggested earlier, right in front of your statement, run that and it'll generate the "detailed log" - a plan, how it obtains the results, what indexes it uses, where and how many rows it dealt with etc. If you could paste that plan here along with the `create` statements for these tables and their indexes, it'll be easier to work this out. If there's something slowing you down from 1min30' to 7h+, my remark about geometry relations won't be enough to make up for that. – Zegarek Dec 07 '22 at 11:32
  • EXPLAIN (ANALYZE, VERBOSE, BUFFERS, COSTS) worked for the fast PostgreSQL 10 Server. I'm running the same on PostgreSQL 13 query again. This will take 7h+ again, I guess but I will study and upload the result after it's completion. Thanks for the help so far. – bluesky Dec 07 '22 at 12:26
  • You can run just EXPLAIN on the v13 while you wait for the other one to finish. The results of that might provided enough info to figure out what is going on without waiting for the full results. – jjanes Dec 07 '22 at 15:40
  • just added the result of the query plan for PostgreSQL 13. It seems that the values are quite similar. The only real difference seems to be the last line when the execution time is much larger on PG13. Planning time also tripled but 77ms still seems very short. When I find time, I will try to make sense of what is written there but it's not really the information we need right? – bluesky Dec 08 '22 at 07:48
  • The PostgreSQL plan shows only 938 estimated rows and 373890 actual. Such a bad estimate can lead to the selection of the wrong operators. With less than 1000 rows the optimizer can decide it's cheaper to just scan rows instead of using spatial indexes for example. Check the operators in the plan to find which one has too few estimated rows – Panagiotis Kanavos Dec 08 '22 at 07:50
  • Too few estimated rows mean missing or out-of-date statistics. – Panagiotis Kanavos Dec 08 '22 at 07:51
  • The second query uses a Nested Loop on the `objid` columns instead of a hash join, which expected 1100 rows instead of 300K. Try running ANALYZE on the two tables to update the statistics – Panagiotis Kanavos Dec 08 '22 at 07:58
  • I ran ANALYZE and afterwards tried the query again. It still uses a nested loop. Is there any other way to tell postgresql to use hash joins? Sorry for the late replys. I can only deal with this when I have spare time on my regular work. Thank you for all your help so far – bluesky Dec 19 '22 at 08:00

0 Answers0