I'm looking for more detailed guidance on, or other people's experience of, using Npgsql in production with PgBouncer.
Basically we have the following setup using GKE and Google Cloud SQL:
Right now I have Npgsql configured as if PgBouncer weren't in place, using a local connection pool. I've added PgBouncer as a deployment in my GKE cluster because Google Cloud SQL has very low max connection limits, and to be able to scale my application horizontally inside Kubernetes I need to protect against overwhelming it.
My problem is one of reliability when one of the pgbouncer pods dies (due to a node failure or as I'm scaling up/down).
When that happens (1), the existing open connections from the client-side connection pools in the application pods don't immediately close (2), and my application gets exceptions as it tries to execute commands. Not ideal!
As I see it (and looking at the advice at https://www.npgsql.org/doc/compatibility.html), I have three options.
1. Live with it and handle retries of SQL commands within my application. Possible, but it seems like a lot of effort and creates lots of possible bugs if I get it wrong.
2. Turn on keepalives and let Npgsql itself fail out the bad connections relatively quickly when the keepalives fail. I'm not even sure if this will work or if it will cause further problems.
3. Turn off client-side connection pooling entirely. This seems to be the official advice, but I am loath to do this for performance reasons: it seems very wasteful for Npgsql to have to open a connection to PgBouncer for each session, and it runs counter to all of my experience with other RDBMSs like SQL Server.
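For concreteness, the keepalive and no-pooling options would look roughly like this as Npgsql connection strings. This is only a sketch: the host name, port, database, user, and the 10-second keepalive value are placeholders I made up for illustration, not my real configuration, and the `#` lines are annotations rather than part of the strings.

```
# Keepalive option: keep the client-side pool, have Npgsql send a
# keepalive after 10 seconds of connection inactivity (placeholder value)
Host=pgbouncer;Port=6432;Database=mydb;Username=app;Password=<secret>;Pooling=true;Keepalive=10

# No-pooling option: disable Npgsql's client-side pool and rely on PgBouncer alone
Host=pgbouncer;Port=6432;Database=mydb;Username=app;Password=<secret>;Pooling=false
```

`Pooling` and `Keepalive` are documented Npgsql connection string parameters; whether `Keepalive` actually detects a dead PgBouncer pod quickly enough is the part I'm unsure about.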
Am I on the right track with one of those options? Or am I missing something?