0

Being thrown into the DBA deep end, I am trying to wrap by head around pgBouncer and PostgreSQL. For a little bit of context, we have deployed a Moodle LMS with PostgreSQL on Google Cloud. The VM (16VCPU, 64GB RAM) and Database (8VCPU, 32GB RAM) are separated and so far we have been able to server over 5000 students without issues.

Recently we have been running real time quizzes on the platform and we started getting this error:

[1-1] db=moodledb,user=postgres FATAL:  remaining connection slots are reserved for non-replication superuser connections"

I noticed that our maximum database connections had exceeded the Cloud SQL default of 600. That was time to put the app into maintenance mode. After some digging, I realized that the quiz (100 questions, 10 pages, 10 questions per page) was accommodating 600 students concurrently. They all logged into the system at the same time, essentially leading it to crash. The spike went from 43 connections to the 600 in a matter of seconds.

As we still have more exams to go, with the largest quiz having a maximum 867 students, I am well aware that I need to prepare for this. We have already discussed the possibility of spreading out the quiz start times so that we have less load at the time the quiz starts but I need to be ready for anything as I need to save face.

Therefore, a lot of forums pointed to pgBouncer to manage connection pooling given the nature of how PostgreSQL spawns new connections. I went ahead to setup an instance in between Moodle and PostgreSQL. After a few tests and configuration tests, I was able to stop the max_connections error above, but pgBouncer gave me a new set of issues. Below is an extract of some of the important settings from my ini file:

;; When server connection is released back to pool:
;;   session      - after client disconnects (default)
;;   transaction  - after transaction finishes
;;   statement    - after statement finishes
pool_mode = session

;; Total number of clients that can connect
;; We don't expect more than this number at any one time
max_client_conn = 5000

;; Default pool size.  20 is good number when transaction pooling
;; is in use, in session pooling it needs to be the number of
;; max clients you want to handle at any moment
;; And since Moodle LMS uses session as the pool type
;; I thought this value is the same as the postgresql max_connections
default_pool_size = 600

;; Minimum number of server connections to keep in pool.
;min_pool_size = 0

; how many additional connection to allow in case of trouble
;reserve_pool_size = 0

;; If a clients needs to wait more than this many seconds, use reserve
;; pool.
;reserve_pool_timeout = 5

;; How many total connections to a single database to allow from all
;; pools
;max_db_connections = 0
;max_user_connections = 0

;; If off, then server connections are reused in LIFO manner
;server_round_robin = 0

With this configuration, I get another set of errors below:

ERROR accept() failed: Too many open files
closing because: client close request (age=160s)
closing because: client close request (age=90s)
WARNING C-0x55c61cc57780: (nodb)/(nouser)...pooler error: no more connections allowed (max_client_conn)
closing because: no more connections allowed (max_client_conn) (age=0s)

Those errors fill up the logs. So here is the question. Unfortunately I am quite new to database administration but I can easily find my way around. Is there anything I need to look at in my configuration that can make things run a little more smoother? Do I need to increase any resources like the Database specs? Is there a formula for coming up with those configuration values for pgBouncer? I would appreciate any help and direction in getting this sorted. Thank you!

realnsleo
  • 709
  • 2
  • 12
  • 29
  • 2
    What pool mode are you using? It sounds like your application is leaking connections, that is, it does not close database transactions or connections. – Laurenz Albe Oct 01 '21 at 07:01
  • @LaurenzAlbe that;'s what I thought as well. I am using `session` mode because Moodle maintains session state in it's transactions. I thought it was a leaking issue and so I removed most of the 3rd party plugins I am not sure of but it still gave the same issue. – realnsleo Oct 01 '21 at 07:04
  • You will have to fix the application code. – Laurenz Albe Oct 01 '21 at 07:21
  • If it keeps state "in it's transactions", then why wouldn't you be able to use transaction pooling? – jjanes Oct 01 '21 at 15:50
  • What version of pgbouncer are you using? – jjanes Oct 01 '21 at 15:56
  • 1
    I would say these are more questions about moodle than they are about pgbouncer. – jjanes Oct 01 '21 at 15:59
  • Hi @jjanes, you're right. This is more a Moodle problem due to the nature of their queries. I am using pgBouncer 1.15.0. I also noticed some of the queries take long to execute and are timed out, leading to unclosed connections. Meaning I have to increase certain PostgreSQL configurations like `shared_buffers` and `work_mem`. Thanks! I will keep digging. – realnsleo Oct 01 '21 at 16:10
  • Maybe you could capture some slow queries and see if they can be tuned (posting them here as new questions with `EXPLAIN` or `EXPLAIN (ANALYZE, BUFFERS) ` if you want help with that) – jjanes Oct 01 '21 at 17:26

0 Answers0