5

I am loading about 300GB of contour line data in to an postgis table. To speed up the process i read that it is fastest to first load the data, and then create an index. Loading the data only took about 2 days, but now I have been waiting for the index for about 30 days, and it is still not ready.

The query was:

create index idx_contour_geom on contour.contour using gist(geom);

I ran it in pgadmin4, and the memory consumption of the progran has varied from 500MB to 100GB++ since.

Is it normal to use this long time to index such a database?

Any tips on how to speed up the process?

Edit: The data is loaded from 1x1 degree (lat/lon) cells (about 30.000 cells) so no line has a bounding box larger than 1x1 degree, most of then should be much smaller. They are in EPSG:4326 projection and the only attributes are height and the geometry (geom).

Einar
  • 71
  • 5
  • please give some idea on contour.contour structure – Vao Tsun May 18 '17 at 11:48
  • 1
    Did you set `maintenance_work_mem` to a really high value like 1GB? That's the main knob to speed up index creation. Still, the time is surprising. Is the CPU busy or is I/O the bottleneck? – Laurenz Albe May 18 '17 at 19:39
  • I had not set the `maintenance_work_mem`. This might be the problem. I set it now with `ALTER SYSTEM SET maintenance_work_mem = '1GB'; SELCT pg_reload_conf();` I'm not sure if I have to restart my indexing or not fo this to take effect, but I'll let it run over the weekend and restart it on monday if it hasn't helped. Thank you @LaurenzAlbe! - (CPU is not very busy, but I am surprised that memory consumption is this high when maintenance_work_mem is set very low.) – Einar May 19 '17 at 06:52
  • You'd have to restart indexing for the parameter to take effect, but you are right to wait over the weekend. The memory consomption is indeed too high, perhapes a bug with PostGIS or GiST. Check if the system is paging (i.e., lot of I/O on the swap partition). That would explain the excessive duration, and then high `maintenance_work_mem`also won't help. In that case it would be best to ask the PostGIS people for help. – Laurenz Albe May 19 '17 at 07:09

1 Answers1

2

I changed the maintenance_work_mem to 1GB and stopped all other writing to disk (a lot of insert opperations had ANALYZE appended, which took a lot of resources). I now ran in 23min.

Einar
  • 71
  • 5