
I have quite a large PostgreSQL database with up to 300 connected clients, coming from a web application and some background processing.

I'm considering adding PgBouncer, as it is my understanding that PostgreSQL will not continue to scale well with this many connections due to per-connection overhead.

Background

My web application is a multi-tenancy application. E.g. I have a German version of my site at www.my-app.de and a US version at www.my-app.com. Data for each country is split into separate schemas in PostgreSQL. So for each HTTP request the web app switches between data sets in PostgreSQL by starting the request with setting the search_path, like so: if the request is for www.my-app.com, then SET search_path = 'us'. This is really convenient from an application perspective; however, I think it collides with connection pooling?
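For illustration, a minimal sketch of that per-request pattern (the schema names us/de are from above; the customers table is a made-up example):

-- chosen from the request's hostname, e.g. www.my-app.com -> 'us'
SET search_path = 'us';
-- unqualified table names now resolve inside the "us" schema
SELECT * FROM customers LIMIT 10;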

The problem

So in each HTTP request the search_path is set for the entire request, and within that request there may be many queries to the database. It is my understanding that if I put PgBouncer in front for connection pooling, I could risk the first query in a request going to a connection that has search_path set to us and the next query going to a connection that has search_path set to de.

Question

Is there any way to avoid this behaviour with PgBouncer? Or is there an alternative pattern I could use to reduce the connection overhead?

Niels Kristian
  • It seems that PgBouncer will not help you in this case. Check this out: https://github.com/pgbouncer/pgbouncer/pull/73#issuecomment-143957485 – sharez Dec 18 '18 at 13:02
  • I know it's a bad suggestion, but how about using two PgBouncer instances, one for search_path = 'us' and another one for 'de'? – LongBeard_Boldy Dec 19 '18 at 14:21

1 Answer


There is a nice list of PgBouncer limitations on the PostgreSQL Wiki. SET is on that list.

Assuming you want pool_mode = transaction in PgBouncer - which is the most sensible mode - and you have application code that depends on session settings (like search_path), the only way to retain session state is to use transactions.
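For reference, transaction pooling is what you get with roughly this in pgbouncer.ini (a minimal sketch; the port and database name match the prompts below, the backend host is a placeholder):

[databases]
; route the "prod" database to the real PostgreSQL server
prod = host=your-postgres-host port=5432 dbname=prod

[pgbouncer]
listen_port = 6432
; in transaction mode, statements outside a transaction keep no session state
pool_mode = transaction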

If your app just creates a connection to PgBouncer, runs SET search_path TO us on it, and then runs multiple SELECTs, it will not work.

This is all easy to prove - just connect to PgBouncer with psql and use the SET command:

(postgres.example.com:6432) prod=# SET client_min_messages TO debug;
SET
(postgres.example.com:6432) prod=# SHOW client_min_messages ;
 client_min_messages 
---------------------
 notice
(1 row)

As we can see, session state does not persist. You need a transaction:

(postgres.example.com:6432) prod=# begin;
BEGIN
(postgres.example.com:6432) prod=# SET client_min_messages TO debug;
SET
(postgres.example.com:6432) prod=# SHOW client_min_messages ;
 client_min_messages 
---------------------
 debug
(1 row)

(postgres.example.com:6432) prod=# COMMIT;
COMMIT

In other words, to enable transaction pooling, transactions must be stateless - quite the same way as HTTP requests, which carry all of their state within (in the form of cookies, session IDs, etc.).

Obviously, in your case this means modifications to the app code: adding a "wrapper" around each call. That's kind of ugly, and if you have to go down that route, consider rewriting the app to use the country code explicitly, as a query parameter or a database column.
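As a rough sketch (the table and column names are made up for illustration), the "wrapper" approach and the explicit alternatives look something like this:

-- wrapper approach: re-establish the state inside every transaction
BEGIN;
SET search_path = 'us';
-- (SET LOCAL would scope the setting to this one transaction)
SELECT * FROM customers WHERE id = 42;
COMMIT;

-- explicit alternative: schema-qualify each query ...
SELECT * FROM us.customers WHERE id = 42;

-- ... or use one shared schema with a country column
SELECT * FROM customers WHERE country_code = 'us' AND id = 42;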

filiprem