Being thrown into the DBA deep end, I am trying to wrap my head around pgBouncer and PostgreSQL. For a little context: we have deployed a Moodle LMS with PostgreSQL on Google Cloud. The VM (16 vCPU, 64 GB RAM) and database (8 vCPU, 32 GB RAM) are separate, and so far we have been able to serve over 5,000 students without issues.
Recently we have been running real-time quizzes on the platform, and we started getting this error:
[1-1] db=moodledb,user=postgres FATAL: remaining connection slots are reserved for non-replication superuser connections
I noticed that our database connections had exceeded the Cloud SQL default maximum of 600, so it was time to put the app into maintenance mode. After some digging, I realized that a quiz (100 questions across 10 pages, 10 questions per page) had 600 students sitting it concurrently. They all logged into the system at the same time, essentially crashing it. The spike went from 43 connections to 600 in a matter of seconds.
As we still have more exams to go, with the largest quiz capped at a maximum of 867 students, I am well aware that I need to prepare for this. We have already discussed spreading out the quiz start times so that there is less load at the moment a quiz starts, but I need to be ready for anything, as I need to save face.
A lot of forum threads pointed to pgBouncer for connection pooling, given how PostgreSQL spawns a new backend process per connection. I went ahead and set up an instance between Moodle and PostgreSQL. After a few rounds of testing and configuration, I was able to stop the max_connections error above, but pgBouncer gave me a new set of issues. Below is an extract of the important settings from my ini file:
;; When server connection is released back to pool:
;; session - after client disconnects (default)
;; transaction - after transaction finishes
;; statement - after statement finishes
pool_mode = session
;; Total number of clients that can connect
;; We don't expect more than this number at any one time
max_client_conn = 5000
;; Default pool size. 20 is a good number when transaction pooling
;; is in use; in session pooling it needs to be the number of
;; max clients you want to handle at any moment.
;; Since Moodle LMS uses session as the pool type,
;; I assumed this value should match PostgreSQL's max_connections
default_pool_size = 600
;; Minimum number of server connections to keep in pool.
;min_pool_size = 0
;; How many additional connections to allow in case of trouble
;reserve_pool_size = 0
;; If a client needs to wait more than this many seconds, use reserve
;; pool.
;reserve_pool_timeout = 5
;; How many total connections to a single database to allow from all
;; pools
;max_db_connections = 0
;max_user_connections = 0
;; If off, then server connections are reused in LIFO manner
;server_round_robin = 0
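Going by the comments in the ini file itself, I also sketched what a transaction-pooling setup might look like. To be clear, the numbers below are my own untested guesses, and I have not yet verified whether Moodle actually tolerates transaction pooling:

```ini
;; SKETCH ONLY -- untested values, assuming Moodle works with
;; transaction pooling, which I have not confirmed.
pool_mode = transaction

;; Still let every student connect to pgBouncer itself.
max_client_conn = 5000

;; Far fewer real server connections when transactions are short.
default_pool_size = 50

;; A small cushion for spikes, as the file's comments suggest.
reserve_pool_size = 25
reserve_pool_timeout = 3

;; Keep the total well under the Cloud SQL max_connections of 600.
max_db_connections = 100
```

I have not switched to this yet because of the note that Moodle uses session pooling, which is part of what I am hoping someone can clarify.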
With this configuration, I get another set of errors below:
ERROR accept() failed: Too many open files
closing because: client close request (age=160s)
closing because: client close request (age=90s)
WARNING C-0x55c61cc57780: (nodb)/(nouser)...pooler error: no more connections allowed (max_client_conn)
closing because: no more connections allowed (max_client_conn) (age=0s)
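On the "Too many open files" error: as far as I understand, pgBouncer needs one file descriptor per client connection plus one per server connection, so my settings imply a budget roughly like this (my own back-of-the-envelope arithmetic, not something from the docs):

```shell
# Rough file-descriptor budget for my current pgBouncer settings
# (assuming a single database/user pool).
max_client_conn=5000
default_pool_size=600
overhead=50   # listen sockets, log file, unix socket, etc.

fds_needed=$((max_client_conn + default_pool_size + overhead))
echo "$fds_needed"   # prints 5650, far above the common default ulimit of 1024
```

If that arithmetic is right, a default ulimit of 1024 open files would explain the accept() failures, and raising the limit for the pgBouncer process (e.g. via LimitNOFILE in its systemd unit) is what I plan to test next, though I would welcome confirmation.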
Those errors fill up the logs. So here is my question: I am quite new to database administration, but I can find my way around. Is there anything in my configuration I should change to make things run a little more smoothly? Do I need to increase any resources, such as the database specs? Is there a formula for deriving these pgBouncer configuration values? I would appreciate any help and direction in getting this sorted. Thank you!