I am having issues with a large query, that I expect to rely on wrong configs of my postgresql.config. My setup is PostgreSQL 9.6 on Ubuntu 17.10 with 32GB RAM and 3TB HDD. The query is running pgr_dijkstraCost
to create an OD-Matrix of ~10.000 points in a network of 25.000 links. Resulting table is thus expected to be very big ( ~100'000'000 rows with columns from, to, costs). However, creating simple test as select x,1 as c2,2 as c3
from generate_series(1,90000000)
succeeds.
The query plan:
QUERY PLAN
--------------------------------------------------------------------------------------
Function Scan on pgr_dijkstracost (cost=393.90..403.90 rows=1000 width=24)
InitPlan 1 (returns $0)
-> Aggregate (cost=196.82..196.83 rows=1 width=32)
-> Seq Scan on building_nodes b (cost=0.00..166.85 rows=11985 width=4)
InitPlan 2 (returns $1)
-> Aggregate (cost=196.82..196.83 rows=1 width=32)
-> Seq Scan on building_nodes b_1 (cost=0.00..166.85 rows=11985 width=4)
This leads to a crash of PostgreSQL:
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
normally and possibly corrupted shared memory.
Running dmesg
I could trace it down to be an Out of memory issue:
Out of memory: Kill process 5630 (postgres) score 949 or sacrifice child
[ 5322.821084] Killed process 5630 (postgres) total-vm:36365660kB,anon-rss:32344260kB, file-rss:0kB, shmem-rss:0kB
[ 5323.615761] oom_reaper: reaped process 5630 (postgres), now anon-rss:0kB,file-rss:0kB, shmem-rss:0kB
[11741.155949] postgres invoked oom-killer: gfp_mask=0x14201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=(null), order=0, oom_score_adj=0
[11741.155953] postgres cpuset=/ mems_allowed=0
When running the query I also can observe with top
that my RAM is going down to 0 before the crash. The amount of committed memory just before the crash:
$grep Commit /proc/meminfo
CommitLimit: 18574304 kB
Committed_AS: 42114856 kB
I would expect the HDD is used to write/buffer temporary data, when RAM is not enough. But the available space on my hdd does not change during the processing. So I began to dig for missing configs (expecting issues due to my relocated data-directory) and following different sites:
https://www.postgresql.org/docs/current/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT https://www.credativ.com/credativ-blog/2010/03/postgresql-and-linux-memory-management
My original settings of postgresql.conf are default except for changes in the data-directory:
data_directory = '/hdd_data/postgresql/9.6/main'
shared_buffers = 128MB # min 128kB
#huge_pages = try # on, off, or try
#temp_buffers = 8MB # min 800kB
#max_prepared_transactions = 0 # zero disables the feature
#work_mem = 4MB # min 64kB
#maintenance_work_mem = 64MB # min 1MB
#replacement_sort_tuples = 150000 # limits use of replacement selection sort
#autovacuum_work_mem = -1 # min 1MB, or -1 to use maintenance_work_mem
#max_stack_depth = 2MB # min 100kB
dynamic_shared_memory_type = posix # the default is the first option
I changed the config:
shared_buffers = 128MB
work_mem = 40MB # min 64kB
maintenance_work_mem = 64MB
Relaunched with sudo service postgresql reload
and tested the same query, but found no change in behavior. Does this simply mean, such a large query can not be done? Any help appreciated.