62

Let we say you are running your business on top of postgresql database. After some time you get so much traffic that it cannot be handled by single postgresql instance, so you want to add more instances (scale it horizontally) to be able to handle growth.

Your data is relational, so probably switching to some key/value solution is not an option.

How would you do it with postgresql?

PS. Postgresql version: 9.5

CommonSenseCode
  • 23,522
  • 33
  • 131
  • 186
user232343
  • 2,008
  • 5
  • 22
  • 34
  • 1
    Use a Master/Slave replication (with multiple slaves) with a load balancer (e.g. pgPool) http://www.postgresql.org/docs/current/static/high-availability.html –  Jan 16 '16 at 19:22
  • 2
    you can take a look at Postgres-XL to use sharding, but be aware of its limitations http://www.postgres-xl.org/ – Dmitry S Jan 17 '16 at 10:24
  • 1
    @DmitrySavinkov where can I find more info on limitations of Postgres-XL? – ctrl-alt-delete Nov 30 '16 at 14:10
  • 2
    @toasteez we used PostgresXL based on Postgres 9.3 back in 2014. Now they announced v9.5 base for XL, which is much better. In our app we had heavy trigger logic implemented, it was a main blocker for us, so it may not a blocker for your use case. – Dmitry S Nov 30 '16 at 17:31
  • 3
    citusdb has worked pretty well for us so far. – jmunsch Aug 25 '17 at 04:00
  • 1
    @user232343: What did you end up doing? – boggy Nov 14 '17 at 01:32
  • What do you think about using https://www.citusdata.com/ ?) – Maks Oct 25 '20 at 12:14

1 Answers1

62
  1. If it is about read-heavy workload then you should just add replicas. Add as many replicas as you need to handle the whole workload. You can balance all the queries across the replicas in the round robin fashion.

  2. If it is about write-heavy workload, then you should partition your database across many servers. You can put different tables on different machines or you can shard one table across many machines. In the latter case, you can shard a table by a range of the primary key, or by a hash of the primary key, or even vertically by rows. In each of the cases above you may lose transactionality, so be careful, and make sure that all the data that is changed and queried by a transaction resides on the same server.

jonschlinkert
  • 10,872
  • 4
  • 43
  • 50
Dennis Anikin
  • 961
  • 5
  • 7
  • 1
    Could you define "transactionality", when you say _you may lose transactionality_? – alelom Oct 10 '21 at 17:16
  • 2
    @alelom I think by _you may lose transactionality_ he means that you may lose the ACID guarantees that transactions typically afford you: https://en.wikipedia.org/wiki/ACID – Magne Feb 12 '22 at 13:51