0

I use Postgis extension to figure out whether given points (geographic coordinates) are in a polygon. The problem is the query is too slow. For 100 points it takes about 7 secs to be executed.

There is some info:

db=$ \set mpol ST_GeomFromText(pg_read_file($$large_pol.txt$$))

db=# select st_npoints(:mpol);
 st_npoints 
------------
      39522
(1 row)

Time: 125.451 ms


db=# select count(*)
     from (select * from cargos limit 100) c
     where st_within(c.geo_origin_point::geometry, :mpol);

 count 
-------
     0
(1 row)

Time: 7532.868 ms (00:07.533)

I've tried to simplify the polygon with ST_Simplify, but so far that did not affect the query execution time at all.

db=# select st_npoints(st_simplify(:mpol, 1));
 st_npoints 
------------
         14
(1 row)

Time: 99.213 ms

db=# select count(*)
     from (select * from cargos limit 100) c
     where st_within(c.geo_origin_point::geometry, st_simplify(:mpol, 1));

 count 
-------
     0
(1 row)

Time: 7667.504 ms (00:07.668)

The column geo_origin_point in the table cargos:

db=# \dS cargos;
   geo_origin_point     | geography(Point,4326)       |           |          
Indexes:
    "cargos_geo_origin_point" gist (geo_origin_point)

I guess, there is something definitely wrong if even for 100 points it takes so much time.

For some reason the index is not being used:

db=# explain analyze select count(*) from (select * from cargos limit 100) c where st_within(c.geo_origin_point, st_simplify(:mpol, 1));
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=31.31..31.32 rows=1 width=8) (actual time=8698.561..8698.562 rows=1 loops=1)
   ->  Subquery Scan on c  (cost=0.00..31.23 rows=33 width=0) (actual time=8698.555..8698.555 rows=0 loops=1)
         Filter: st_within(c.geo_origin_point, st_simplify(st_geomfromtext(pg_read_file('moscow_obl_pol.txt'::text)), '1'::double precision))
         Rows Removed by Filter: 100
         ->  Limit  (cost=0.00..4.48 rows=100 width=2633) (actual time=0.042..0.580 rows=100 loops=1)
               ->  Seq Scan on cargos  (cost=0.00..2121.67 rows=47367 width=2633) (actual time=0.038..0.516 rows=100 loops=1)
 Planning Time: 0.186 ms
 Execution Time: 8698.641 ms
(8 rows)

UPDATE:

I just found out that if I don't load polygon data from the file the query is being executed almost instatnly. That's quite weird.

db=# select count(*) from cargos c where st_within(c.geo_origin_point, ST_GeomFromText('MULTIPOLYGON(((-71.1031880899493 42.3152774590236, -71.1031627617667 42.3152960829043,-71.102923838298 42.3149156848307, -71.1023097974109 42.3151969047397,-71.1019285062273 42.3147384934248, -71.102505233663 42.3144722937587,-71.10277487471 42.3141658254797, -71.103113945163 42.3142739188902,-71.10324876416 42.31402489987, -71.1033002961013 42.3140393340215,-71.1033488797549 42.3139495090772, -71.103396240451 42.3138632439557,-71.1041521907712 42.3141153348029, -71.1041411411543 42.3141545014533,-71.1041287795912 42.3142114839058, -71.1041188134329 42.3142693656241,-71.1041112482575 42.3143272556118, -71.1041072845732 42.3143851580048,-71.1041057218871 42.3144430686681, -71.1041065602059 42.3145009876017,-71.1041097995362 42.3145589148055, -71.1041166403905 42.3146168544148,-71.1041258822717 42.3146748022936, -71.1041375307579 42.3147318674446,-71.1041492906949 42.3147711126569, -71.1041598612795 42.314808571739,-71.1042515013869 42.3151287620809, -71.1041173835118 42.3150739481917,-71.1040809891419 42.3151344119048, -71.1040438678912 42.3151191367447,-71.1040194562988 42.3151832057859, -71.1038734225584 42.3151140942995,-71.1038446938243 42.3151006300338, -71.1038315271889 42.315094347535,-71.1037393329282 42.315054824985, -71.1035447555574 42.3152608696313,-71.1033436658644 42.3151648370544, -71.1032580383161 42.3152269126061,-71.103223066939 42.3152517403219, -71.1031880899493 42.3152774590236)))',4326));

 count 
-------
     0
(1 row)

Time: 2.251 ms
Agent Coop
  • 392
  • 4
  • 12

1 Answers1

2

The first query st_contains(c.geo_origin_point::geometry, :mpol); will never return anything as it looks for polygons that are contained in points.. You would need to swap the parameters, or to use st_within as in the 2nd query.

The bottleneck is that the spatial index is not used. You have indexed the geography but you are using a conversion to geometry. Try indexing the convertion then:

CREATE INDEX cargos_geom_origin_point ON public.cargos USING gist((geo_origin_point::geometry)); 
JGH
  • 15,928
  • 4
  • 31
  • 48
  • Yeah, that was my mistake, I corrected it. Your query fails with: ERROR: syntax error at or near "::" LINE 1: ...point ON public.cargos USING gist(geo_origin_point::geometry... – Agent Coop Nov 29 '18 at 13:25
  • @Gustav.Calder I forgot an extra set of (), please try again – JGH Nov 29 '18 at 13:28