I have a weird issue on my postgresql 9.5 which is running on ubuntu 16.
The framework is Ruby on Rails and it's using pgbouncer
# pgbouncer.ini
[databases]
app_production = host=localhost dbname=app_production port=5432
[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 16432
auth_file = userlist.txt
logfile = pgbouncer.log
pidfile = pgbouncer.pid
admin_users = appuser
max_client_conn = 1000
default_pool_size = 1000
# database.yml
production:
adapter: postgresql
host: 127.0.0.1
database: app_production
username: appuser
password: defaultpassword
encoding: unicode
pool: 1000
port: 16432
# DB Version: 14
# OS Type: linux
# DB Type: web
# Total Memory (RAM): 48 GB
# CPUs num: 12
# Connections num: 1000
# Data Storage: hdd
max_connections = 1000
shared_buffers = 12GB
effective_cache_size = 36GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 4
effective_io_concurrency = 2
work_mem = 3145kB
min_wal_size = 1GB
max_wal_size = 4GB
max_worker_processes = 12
max_parallel_workers_per_gather = 4
max_parallel_workers = 12
max_parallel_maintenance_workers = 4
Our app needs to have at least 200 pool because we sometimes run over 200 threads for over several hours or 1 day. While running this kind of background job, our database sometimes goes into recovery mode and terminate all connections. Then after 1 or 2 seconds, all gets normal. But terminated background jobs have problem. I am not sure why this postgresql db goes into recovery mode sometimes.
System ram is 48GB
I've increased max connections based on How to increase the max connections in postgres?
So currently shared_buffers = 8192MB
and kernel.shmmax=1342177280
I've tried to fix any misconfigured postgresql settings using (postgresqltuner.pl)[https://github.com/jfcoz/postgresqltuner] and all are fine now.
I am running background jobs in ruby on rails which involves over 100 concurrent threads and run over 6hrs. Over several hours running, database goes into recovery mode and stop that background job. What I expect is to avoid any database recovery mode issue