1

Currently I am working on making the 2 node cluster of PostgreSQL on bare metal cloud. I am very confused about either which approach should I go.

Like i have one option that is PostgreSQL BDR (bi directional replica). In this approach, I have benefit that my both nodes will have read and write access. but now I came to know about PostgreSQL XL. This approach works on sharding approach. Can anybody tell me or help to that either which approach should I go? Sharding will give me benefit or not? I want my Postgres highly available and fast. Which approach will help me in this regard.

Or any other suggestions that you wanna give me.

One more thing. I want to make my cluster horizontal scalable.

Amad
  • 314
  • 1
  • 6
  • 25
  • 3
    Setting up and managing a cluster with BDR or Postgres-XL is not something for a beginner. I would suggest you start simple with a standard master/replication with a hot standby using the built-in tools, using pgpool or pgbouncer to do the load balancing. That will give you high availability and horizontal read scaling. Tools like repmgr can help you to quickly provision new replicas. Especially if you are new to Postgres (or even relational databases) start small - scale later. Don't underestimate Postgres ability to perform quite well on a single (powerful) node –  Oct 12 '17 at 06:25
  • but i want both nodes to have write write access. which hot standby can not provide me – Amad Oct 12 '17 at 06:48
  • Why do you think you need that? How many (write) transactions per second do you expect? What kind of application is that? And why do you think a single node wouldn't be able to handle that? –  Oct 12 '17 at 06:51
  • minimum 1 million concurrent users will be accessing DB and they will be writing some things and in most cases searching queries will be executing. – Amad Oct 12 '17 at 07:22
  • 1
    I doubt that you have 1 million concurrent **database** users. If you have that many concurrent _application_ users you absolutely need a connection pool (for _any_ database, not just Postgres). We manage online shops with something like 2000 page impressions per second on a simple master/slave (two node) setup. Don't overcomplicate things from the start. –  Oct 12 '17 at 07:27
  • okay. can you please tell me some steps, which i should follow to start. like you know i need highly available cluster of postgres. what should i do now? give me some key points please on which should i work – Amad Oct 12 '17 at 07:36
  • https://www.postgresql.org/docs/current/static/high-availability.html to simplify the management of the standby servers you can use: http://repmgr.org/ –  Oct 12 '17 at 07:37
  • 1
    and even more important than HA is DR and **backups**. PgBarman, PgBackrest, OmniPITR, WAL-E, etc etc – Craig Ringer Oct 12 '17 at 08:25
  • You can deal with concurrent user loads with layered caching etc too, remember. – Craig Ringer Oct 12 '17 at 09:03

1 Answers1

2

The best solution in most cases is option (c): neither. Use stock PostgreSQL + active/standby failover.

I say that as a BDR developer. It's a great tool (in my opinion) for workloads that need it. But it comes with some considerable costs, like any multi-master system, and should not be used if you don't actually need it.

Most people who think they need multi-master, don't. Or rather, don't understand the impacts and trade-offs.

Read the BDR documentation on multi-master conflicts.

Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
  • what if i need horizontal scalability in future? than what will i do? – Amad Oct 12 '17 at 07:20
  • 3
    Deal with it in future. What if you need a 100TB storage array in future? Do you buy one now? – Craig Ringer Oct 12 '17 at 07:35
  • no i can't buy now. that the reason i am trying to find optimal solution for my use case. – Amad Oct 12 '17 at 07:41
  • https://en.wikipedia.org/wiki/Program_optimization . https://rachelandrew.co.uk/archives/2012/03/21/stop-solving-problems-you-dont-yet-have/ . But also http://ubiquity.acm.org/article.cfm?id=1513451 . – Craig Ringer Oct 12 '17 at 07:47
  • 1
    Also read about PACELC and CAP - see https://brooker.co.za/blog/2014/07/16/pacelc.html . Multimaster isn't magic high availability sauce for your system. It introduces other problems. Any vendor who tells you otherwise is lying, or has technology that can violate the speed of light. – Craig Ringer Oct 12 '17 at 07:55