1

The routed_way table has a column named way where the path of a vehicle is stored. The length of these paths can vary greatly, with some being short and others containing millions of points. Although the table only has thousands of rows, there are two issues that need to be addressed.

  1. Deleting rows from this table is very slow and can take several minutes.
  2. Queries using the "st_intersects" function are also slow and take nearly one minute to complete.
create table public.routed_way
(
    id              uuid default gen_random_uuid() not null primary key,
    created_at      timestamp with time zone default CURRENT_TIMESTAMP,
    updated_at      timestamp with time zone,
    sourcer_class   text not null,
    sourcer_id      text  not null,
    route_start     timestamp with time zone,
    route_end       timestamp with time zone not null,
    way             geometry(LineString, 4326) not null,
    time_delta      integer[],
    elevation_delta integer[]
);
alter table public.routed_way owner to postgres;
create index idx_routed_way_created_at    on public.routed_way (created_at);
create index idx_routed_way_sourcer_class on public.routed_way (sourcer_class);
create index idx_routed_way_route_start   on public.routed_way (route_start);
create index idx_routed_way_way           on public.routed_way using gist (way);
select 
    id, 
    route_start, 
    route_end, 
    st_transform(way, 3857) as geometry
from routed_way
where 
    (route_start > now() - interval '7 day' 
    and route_start < now() - interval '3 day')
  and st_intersects(st_transform(way, 3857),!bbox!)
Gather  (cost=1072.66..63827.33 rows=1 width=64)
  Workers Planned: 1
  ->  Parallel Bitmap Heap Scan on routed_way  (cost=72.66..62827.23 rows=1 width=64)
        Recheck Cond: ((route_start > (now() - '7 days'::interval)) AND (route_start < (now() - '3 days'::interval)))
"        Filter: st_intersects(st_transform(way, 3857), '0103000020110F00000100000005000000010000E0FFFFEFC7010000E0FFFFEFC7010000E0FFFFEFC7010000E0FFFFEF47010000E0FFFFEF47010000E0FFFFEF47010000E0FFFFEF47010000E0FFFFEFC7010000E0FFFFEFC7010000E0FFFFEFC7'::geometry)"
        ->  Bitmap Index Scan on idx_routed_way_route_start  (cost=0.00..72.66 rows=2037 width=0)
              Index Cond: ((route_start > (now() - '7 days'::interval)) AND (route_start < (now() - '3 days'::interval)))

I am unsure what values Mapnik is using for the !bbox! so I attempted to use some values in st_makeenvelope(), but according to the EXPLAIN output, it seems that the desired index was not applied.

Explain analyze verbose select
    id,
    route_start,
    route_end,
    st_transform(way, 3857) as geometry
from routed_way
where
        (route_start > now() - interval '7 day'
  and route_start < now() - interval '3 day')
  and st_intersects(way,ST_MakeEnvelope(-180, -90, 180, 90, 4326));
Bitmap Heap Scan on public.routed_way  (cost=35.31..93324.71 rows=1856 width=64) (actual time=5.529..7573.988 rows=1848 loops=1)
"  Output: id, route_start, route_end, st_transform(way, 3857)"
  Recheck Cond: ((routed_way.route_start > (now() - '7 days'::interval)) AND (routed_way.route_start < (now() - '3 days'::interval)))
"  Filter: st_intersects(routed_way.way, '0103000020E6100000010000000500000000000000008066C000000000008056C000000000008066C0000000000080564000000000008066400000000000805640000000000080664000000000008056C000000000008066C000000000008056C0'::geometry)"
  Heap Blocks: exact=345
  ->  Bitmap Index Scan on idx_routed_way_route_start  (cost=0.00..34.85 rows=1856 width=0) (actual time=0.091..0.092 rows=1848 loops=1)
        Index Cond: ((routed_way.route_start > (now() - '7 days'::interval)) AND (routed_way.route_start < (now() - '3 days'::interval)))
Planning Time: 0.143 ms
Execution Time: 7574.252 ms
postgres=> \d+ routed_way
                                                   Table "public.routed_way"
     Column      |           Type            | Collation | Nullable |      Default      | Storage  | Stats target | Description 
-----------------+---------------------------+-----------+----------+-------------------+----------+--------------+-------------
 id              | uuid                      |           | not null | gen_random_uuid() | plain    |              | 
 created_at      | timestamp with time zone  |           |          | CURRENT_TIMESTAMP | plain    |              | 
 updated_at      | timestamp with time zone  |           |          |                   | plain    |              | 
 sourcer_class   | text                      |           | not null |                   | extended |              | 
 sourcer_id      | text                      |           | not null |                   | extended |              | 
 route_start     | timestamp with time zone  |           |          |                   | plain    |              | 
 route_end       | timestamp with time zone  |           | not null |                   | plain    |              | 
 way             | geometry(LineString,4326) |           | not null |                   | external |              | 
 time_delta      | integer[]                 |           |          |                   | extended |              | 
 elevation_delta | integer[]                 |           |          |                   | extended |              | 
Indexes:
    "routed_way_pkey" PRIMARY KEY, btree (id)
    "idx_routed_way_created_at" btree (created_at)
    "idx_routed_way_route_start" btree (route_start)
    "idx_routed_way_sourcer_class" btree (sourcer_class)
    "idx_routed_way_way" gist (way) CLUSTER
Referenced by:
    TABLE "routed_point" CONSTRAINT "fk_routed_point_way_id" FOREIGN KEY (way_id) REFERENCES routed_way(id) ON UPDATE CASCADE ON DELETE CASCADE
Access method: heap
Zegarek
  • 6,424
  • 1
  • 13
  • 24
kolisko
  • 1,548
  • 3
  • 17
  • 22
  • 1
    Use `gist` index to speed up the `st_intersects()`. `create index on public.routed_way using gist(way)`. For that purpose it's better to keep the geometry-pre-transformed tot he right SRID. – Zegarek Feb 12 '23 at 11:29
  • @zegarek I apologize, but an index has already been applied to the table. I just inserted an outdated DDL here. I have updated the DDL, as seen above. – kolisko Feb 12 '23 at 12:02
  • 1
    The fact that you're using `st_transform()` on the indexed column before it's processed by `st_intersects()` could be what's stopping it from using the index. See my answer. – Zegarek Feb 12 '23 at 12:04

3 Answers3

3

According to what you initially presented, you have no index on the geometry type column way. That's causing the heap scan.

  1. You need to apply a GiST index on the column to speed up all PostGIS ST_ operations on it.
  2. Speed things up further by skipping the TOAST mechanism - set storage external.
  3. cluster using that index
  4. Make sure the table statistics are up to date with analyze
  5. If the traffic on your table isn't high, you can increase the fillfactor of your index.
alter table public.routed_way
   alter column way
   set storage external;
create index idx_routed_way_gist_way 
   on public.routed_way 
   using gist(way) 
--Default is 90. 100 is for static, read-only.
   with (fillfactor=100); 
cluster verbose public.routed_way using idx_routed_way_gist_way;
analyze public.routed_way;

You are also using ST_Transform on the whole way column instead of just the !bbox! geometry you're intersecting with, which seems counter-productive. If you reverse that, it will have to be done just once per statement instead of once per target table record:

select 
    id, 
    route_start, 
    route_end, 
    st_transform(way, 3857) as geometry
from routed_way
where 
    (route_start > now() - interval '7 day' 
    and route_start < now() - interval '3 day')
  and st_intersects(way,st_transform(!bbox!,4326));

If your delete statements were using a similar, geometry-based where clause, you can speed them up the same way.

In principle, the larger your geometries get the less useful your index will be: if the bbox of your geometry spans the entire target area, it's assumed to be spatially related to everything. You can try and use ST_Segmentize() to cut it up into smaller segments with smaller bounding boxes, which will speed everything up dramatically, similar to how ST_Subdivide() is used (on all the Things).

In an extreme case, if the !bbox! geometry you're comparing to is large enough, you could consider st_subdividing it it into a temp table, indexing it, then joining with that table on st_intersects(way_segmentized,bbox_subdivided).

Subdivision and segmentation of your geometries also speed things up not just because it's easier to compare smaller objects (same way it's easier to process shorter texts and smaller numbers) and easier to discard and ignore ones that are far enough to be dismissed (r-tree index), but this also decreases the need/chance of TOASTing them out of their page.


Except set storage external and making sure your stats are up to date (other indexes you use benefit from that as well), none of the above will improve the performance if the query targets everything in the table. Rearranging how your data is stored, accessed and processed can help selective queries get their target records faster, but it doesn't affect your underlying storage medium read speed that will dictate your response time if you ask the db to read the whole thing.

Any sufficiently non-specific filter, in your case a large !bbox! geometry, like the global ST_MakeEnvelope(-180,-90,180,90,4326) you mentioned, will ignore your spatial indexes because it's immediately obvious they match everything in them, so they might just as well not be considered at all.

Zegarek
  • 6,424
  • 1
  • 13
  • 24
  • I apologize, but an index has already been applied to the table. I just inserted an outdated DDL here. I have updated the DDL, as seen above. – kolisko Feb 12 '23 at 12:06
  • @kolisko I've noticed and responded. Most of my other points still apply. – Zegarek Feb 12 '23 at 12:12
  • Thank you for your advice, I have run the SQL but I'm not sure how to confirm that the change was applied. When I view the table's DDL, I don't see any changes. – kolisko Feb 12 '23 at 15:51
  • 1
    In psql you can run `\d+ routed_way` to list the table config in detail. You can also look how the output of `Explain analyze verbose` on the version of your query suggested above differs from the one you showed in the question. – Zegarek Feb 12 '23 at 15:53
  • Thank you, I followed your recommendations, but unfortunately it doesn't seem like the change took effect – kolisko Feb 12 '23 at 18:22
  • I have added more detail on how the query is behaving now, after the changes have been made. See "EDIT 1" – kolisko Feb 12 '23 at 18:36
  • 1
    Even if these changes did anything, `st_intersects(way,ST_MakeEnvelope(-180, -90, 180, 90, 4326))` reads as *"give me all geometries that intersect the entire surface of the globe"*. The planner likely sees that there's no point in using the index if the condition says it concerns *everything* in the table. There's a flat 0% chance the index speeds anything up so it just skips it and goes directly for the heap. After all, if your query asks for everything, the performance will be driven by how fast the db can read and send it. – Zegarek Feb 12 '23 at 19:19
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/251817/discussion-between-zegarek-and-kolisko). – Zegarek Feb 12 '23 at 19:19
0

In addition to @Zegarek answer, if you can't control the bounding box intersection clause because it is automatically added by your map server, you can also create an index on the transformed ways:

create index idx_routed_way_way_3857
    on public.routed_way using gist (st_transform(way,3857));
JGH
  • 15,928
  • 4
  • 31
  • 48
-1

Having so large geometry in one record is almost always a bad choice.

Try splitting large geometry (e.g. < 1000 points and < 500 m length) and you will see much more predictable and better performance.

See example of using ST_Segmentize() and ST_Subdivide() here - https://postgis.net/docs/ST_Subdivide.html

SELECT ST_Subdivide( 
         ST_Segmentize( 'LINESTRING(0 0, 85 85)'::geography
                      , 1200000 -- max segment length, m
                      ) ::geometry
         , 8 -- max vertices
        );
mikkapy
  • 254
  • 2
  • 4