
I want to host a web site; the back end will be a PostgreSQL database server. I would like to have multiple web servers in different data centers for redundancy, and then use round-robin DNS across the web servers to distribute the load.

To do this, should I also have multiple databases and use some sort of replication? Or would it be better to have one database and have each web server point to it over the internet?

Update 1

The idea was brought up that multiple web servers should be paired with multiple DB servers, to eliminate the single point of failure. I would really like to have a duplicate setup at each data center: a web server and a DB server in each. I was not sure whether this was possible, so I stayed open to the idea of having only one DB server, or of having all the DBs at one location. But ideally, each data center would have its own.

Anthony D
Well, if you really have the need for redundancy then you would want multiple database servers as well. Otherwise, the single database is another single point of failure. – Charles Hooper Oct 07 '09 at 14:15

2 Answers


I have recently tried to figure this out as well for a similar project, and what I found is that it is not that easy. There are several open-source and commercial alternatives when it comes to replication. However, most of them don't support multi-master setups, and quite a lot of them are hard to set up and maintain.

You have some alternatives going forward, depending on how your web application is set up and how much redundancy you need. The easiest one is probably a warm standby (that functionality is already included in version 8.3), but in the event of a crash you will have some downtime.
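A warm standby works by continuously shipping completed WAL segments from the primary to a standby server that replays them, ready to be promoted on failure. A minimal sketch of the two config fragments involved — the hostnames and paths here are placeholders, not a recommendation:

```
# postgresql.conf on the primary (8.3): archive each finished WAL segment
archive_mode = on
archive_command = 'scp %p standby:/var/lib/pgsql/wal_archive/%f'

# recovery.conf on the standby: replay segments as they arrive,
# using the pg_standby utility shipped in contrib
restore_command = 'pg_standby /var/lib/pgsql/wal_archive %f %p'
```

On a crash you stop pg_standby (e.g. via its trigger-file mechanism) and the standby finishes recovery and comes up read-write; that switchover is where the downtime comes from.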

Other replication solutions that I have looked at include Bucardo, Slony, and PgCluster.

Out of those, I found Bucardo the most interesting, since it supports master-master (though not multi-master) and seems to be quite easy to set up.

Slony is a bit hard to set up and also requires a lot of maintenance when updating tables (DDL changes are not replicated). It is also master-slave only, which means you can only write to one database.

PgCluster doesn't seem to be maintained and doesn't really give a serious impression, judging by the documentation and website (the website hasn't been updated since 2005).

Jimmy Stenke

Doing global round robin via DNS can make your connections non-persistent. If your code isn't node-aware, you should expect writes to land on any of the replicated databases.

You can do a multi-master replication setup with a couple of solutions. I've used PgCluster before with good success. There's also an interesting-looking project over at Postgres-R.

Keep in mind that consistency is your enemy here: if you write to one node and then read from another, your read may beat your commit to the other database.

If the bulk of your queries are reads, you could set up a normal replication scheme and have your code send writes to a central master database. It's more work, but it makes the consistency issue easier to deal with. You could use something like Slony to do the 'normal' replication.
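The "node-aware code" idea above boils down to picking a connection per statement: writes always go to the central master, reads can go to the local replica. A minimal sketch in Python — the DSN strings and the `dsn_for` helper are hypothetical, and a real application would classify statements more carefully than checking for a leading SELECT:

```python
# Hypothetical connection strings; replace with your real hosts.
MASTER_DSN = "host=db-master.example.com dbname=app"   # accepts writes
REPLICA_DSN = "host=db-local.example.com dbname=app"   # read-only copy

def dsn_for(sql: str) -> str:
    """Route a statement: reads to the local replica, writes to the master.

    This naive check only treats statements starting with SELECT as reads;
    anything else (INSERT, UPDATE, DELETE, DDL) is sent to the master.
    """
    is_read = sql.lstrip().lower().startswith("select")
    return REPLICA_DSN if is_read else MASTER_DSN
```

With routing like this, the consistency problem shrinks to one case: a read issued immediately after your own write may not see it yet on the replica, so latency-sensitive read-after-write paths can be pinned to the master as well.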

Edit: I agree with the comments about PgCluster being hard to set up and having bad docs! I didn't link to them as their site is down as I write this.

kordless