
I have a question regarding streaming replication in Postgres.

I have this one DB server that runs Postgres, and it's heavy on both reads and writes (it runs bulk inserts, large selects, minute-to-minute jobs, etc.). I'm trying to find a strategy to handle the load.

I've seen talk about using streaming replication to another server on the same network to be used as a read-only database for read operations, while sending writes to the master.
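For reference, a minimal streaming-replication setup looks roughly like the sketch below (values are illustrative defaults, not tuned recommendations; `wal_keep_segments` is the pre-v13 parameter name):

```
# postgresql.conf on the primary
wal_level = replica            # emit enough WAL for a standby to replay
max_wal_senders = 3            # allow a few replication connections
wal_keep_segments = 64         # retain WAL so the standby doesn't fall off
```

On the standby host, `pg_basebackup -R` against the primary produces a ready-to-start copy configured for standby mode.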

My question is: since the master is write-heavy and dirties large numbers of blocks, won't the same write load hit the slave via replication? Is replaying the replication stream as heavy as the original writes coming in from the web app?
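One rough way to answer this empirically is to watch how far the standby lags behind the primary (a sketch using the pre-10 function and column names; in Postgres 10 these become `pg_current_wal_lsn()`, `pg_wal_lsn_diff()`, and `replay_lsn`):

```sql
-- run on the primary: bytes of WAL each standby still has to replay
SELECT application_name,
       pg_xlog_location_diff(pg_current_xlog_location(),
                             replay_location) AS replay_lag_bytes
FROM pg_stat_replication;
```

If the lag stays near zero under your bulk-insert bursts, the standby is keeping up comfortably.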

Also, on my current server (20 GB RAM, 11 cores) Postgres uses around 6 GB, which appears to be mostly the checkpointer and autovacuum processes, then queries. The load average spikes and drops between 2.0 and 16.0 or more.
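If checkpoints dominate, spreading them out is often the first tuning step. A sketch of the relevant settings (values here are illustrative, not recommendations for this specific workload):

```
checkpoint_timeout = 15min            # default 5min; fewer, larger checkpoints
checkpoint_completion_target = 0.9    # spread checkpoint I/O over the interval
max_wal_size = 4GB                    # 9.5+; allow more WAL between checkpoints
log_checkpoints = on                  # see whether checkpoints are time- or size-driven
```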

I don't want to invest the time and money in this unless I have a good reason for doing so.

What is your advice? Should I invest more time in configuration and hardware, or should I set up replication as well? My database and user base are growing every month, and I will soon need a truly scalable setup before things get really slow on the frontend and users get frustrated.

P.S. This isn't a typical CRUD app. My clients connect to their CRM and pull in hundreds of thousands of contacts, which are bulk inserted into the database and then resynced/updated every hour.
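For hourly bulk resyncs like this, loading into a staging table with `COPY` and then merging is usually far cheaper than row-by-row INSERTs. A sketch (the table, column names, and file path are made up for illustration; `ON CONFLICT` requires Postgres 9.5+):

```sql
-- load the CRM export into an unlogged staging table (no WAL for the load)
CREATE UNLOGGED TABLE contacts_staging (LIKE contacts INCLUDING DEFAULTS);
COPY contacts_staging FROM '/tmp/crm_export.csv' WITH (FORMAT csv, HEADER true);

-- merge into the real table in one statement
INSERT INTO contacts
SELECT * FROM contacts_staging
ON CONFLICT (contact_id) DO UPDATE
  SET email = EXCLUDED.email;
```

Note that unlogged tables are not replicated, so only the final merge generates WAL for a standby to replay.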

  • Can you distribute the database based on some criteria? e.g. location, tenant, country, ...? In that case sharding via foreign data wrappers might help –  Nov 28 '17 at 16:21
  • I don't see that being much of an issue as most of the db interaction goes on inside the server. The big issue right now is that the workers inside the server are pounding the db, and this slows down reads due to locks and overall server load. I do have table partitioning in place for each user. For example: table ContactsPartition122, where 122 is the userid. –  Nov 28 '17 at 16:49
  • I am not talking about partitioning. I am talking about sharding where the data is divided between different _servers_. E.g. the data for Germany is on server one, the data for Austria is on server two and so on. That way the I/O load is also distributed over several servers. But that only works if you can actually find such a "sharding key". –  Nov 28 '17 at 17:41
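The foreign-data-wrapper approach mentioned in the comments above can be sketched like this (the server names, credentials, and country-based sharding key are hypothetical):

```sql
CREATE EXTENSION postgres_fdw;

-- point at the shard that holds, say, the German data
CREATE SERVER shard_de FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host 'db-de.internal', dbname 'crm');
CREATE USER MAPPING FOR CURRENT USER SERVER shard_de
  OPTIONS (user 'app', password 'secret');

-- a local table backed by the remote shard
CREATE FOREIGN TABLE contacts_de (
  contact_id bigint,
  country    text,
  email      text
) SERVER shard_de OPTIONS (table_name 'contacts');
```

Queries against `contacts_de` are pushed to the remote server, so both the I/O and the working set are split across machines, but only if a stable sharding key exists.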
