
I have a tricky situation. I manage an e-commerce site built on Django with PostgreSQL. It has two versions: English and Japanese. A release has brought in a huge number of users, and the site (specifically Postgres) is overloaded and crashing. The only safe solution I can think of is to split the two onto separate servers, so that EN and JP traffic each get a dedicated server. The new server is now ready, but during DNS propagation, while it is half-propagated (the new server visible from some countries and the old one from others), there will be transactions on both. Users are buying digital goods in the hundreds every minute, so there is no way to take the server offline for the cutover.

Is there a way to sync the two databases at a later stage? (If both shared a database, the new server would be pointless.) The bottleneck is Postgres, which has already been tuned for the maximum possible connections on this server, and kernel.shmmax is at its limit. Connection pooling would also take time to set up and require some downtime, which I am not permitted at the moment. By "sync" I mean that once full propagation occurs, I want to unify the DB dump files from both servers into one that contains all records from both, merged in time order. The schema is rather complex, so many tables will need syncing. Is this doable?
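A rough sketch of the kind of timestamp-ordered merge I have in mind, in Python. The table and column names (`id`, `created_at`, `item`) are placeholders, not the real schema:

```python
# Rough sketch only: merge rows exported from two servers into one
# chronologically ordered set, assigning fresh primary keys so the two
# id sequences cannot collide. Column names are illustrative placeholders.

def merge_tables(rows_a, rows_b, ts_key="created_at"):
    """Merge two row lists by timestamp, returning (merged, id_map).

    id_map maps (source, old_id) -> new_id so that foreign keys in
    dependent tables can be rewritten using the same mapping.
    """
    tagged = [("a", r) for r in rows_a] + [("b", r) for r in rows_b]
    tagged.sort(key=lambda pair: pair[1][ts_key])
    merged, id_map = [], {}
    for new_id, (src, row) in enumerate(tagged, start=1):
        id_map[(src, row["id"])] = new_id
        merged.append({**row, "id": new_id})
    return merged, id_map


rows_en = [
    {"id": 1, "created_at": "2014-08-12T10:00", "item": "en-1"},
    {"id": 2, "created_at": "2014-08-12T10:02", "item": "en-2"},
]
rows_jp = [{"id": 1, "created_at": "2014-08-12T10:01", "item": "jp-1"}]

merged, id_map = merge_tables(rows_en, rows_jp)
# merged item order: en-1, jp-1, en-2; jp-1's old id 1 becomes new id 2
```

Every table with foreign keys would then need its references rewritten through `id_map`, and unique constraints (usernames, emails, order numbers) would need conflict handling case by case, which is where the business-logic problems come in.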

Thanks in advance!

Nikhil Khullar
  • "tuned for maximum possible connections on this server" — it is generally a bad idea to raise the connection limit beyond 10-20. Use a connection pool (like pgbouncer) instead. See this article: http://wiki.postgresql.org/wiki/Number_Of_Database_Connections – Ihor Romanchenko Aug 12 '14 at 11:08
  • @IgorRomanchenko Thanks for the reply. Yes, I am aware of that! This is just a fix for a one-week campaign period, and implementing pgbouncer needs downtime because Django is being used. That is why there is a need to sync the databases. Is there any way to sync two identical Postgres DBs holding some different data, so that recent entries from both get merged chronologically? – Nikhil Khullar Aug 12 '14 at 11:10
  • "So, that recent entries from both get merged chronologically" — I do not know of a tool that can do this automatically. There are too many problems with DB and business-logic consistency to do it automatically. – Ihor Romanchenko Aug 12 '14 at 11:13
  • I rarely see a system that's overloaded that doesn't turn out to have (a) vacuum / bloat problems; (b) poor index choices; (c) poor checkpoint and autovacuum configuration; (d) really inefficient queries and/or (e) massively too many `max_connections` causing thrashing. Splitting it might help you kick the problem down the road a little, but the correct thing to do is usually a mixture of proper connection pooling (e.g. pgbouncer), tuning, and fixing bloat. In genuinely high load cases adding a caching layer can be valuable. – Craig Ringer Aug 12 '14 at 13:51
  • BTW, "The bottleneck is Postgres, and has already been tuned for maximum possible connections on this server, and kernel.shmmax is at its limit" says that your system is almost certainly thrashing very inefficiently. You *must* put a connection pool in place instead. Downtime should be utterly trivial, as pgbouncer in transaction-pooling mode is effectively transparent to most apps. – Craig Ringer Aug 12 '14 at 13:52
  • As for the sync: the short version is "no, you'll have to write something that understands your application's schema and data". And you'll need a downtime window to do the sync, too... – Craig Ringer Aug 12 '14 at 13:54
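For the pooling route suggested in the comments above, a minimal pgbouncer configuration in transaction-pooling mode might look like the following sketch. The database name, auth file path, and pool sizes are assumptions for illustration, not values from the question:

```ini
; /etc/pgbouncer/pgbouncer.ini -- minimal sketch, values are illustrative
[databases]
shopdb = host=127.0.0.1 port=5432 dbname=shopdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
default_pool_size = 20
max_client_conn = 500
```

On the Django side, the only change would be pointing the `PORT` in `DATABASES['default']` at 6432 (pgbouncer) instead of 5432 (Postgres), so the cutover is a settings edit plus a restart rather than extended downtime.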

0 Answers