Rails/Mongo across multiple different geo-regions

Question

I have a system that by necessity requires physical presence in three or more different locations and I need advice on structuring in such a way that my database stays replicated in a timely manner without horrible latency. I've seen mysql access and replication be incredibly slow when the application server was trying to talk to a node that wasn't physically collocated. In this case I am using mongodb.

The stack is linux/passenger/ruby/rails/mongodb.
The database is write heavy and read light.
The infrastructure is Amazon EC2
The application layer must be physically located in 3 or more different locations. I can't justify this requirement further than it is a requirement. The database, however, needn't be located in more than one location if it can be written to quickly from other locations.

From reading mongo's documentation, mongo replication seems like more of a candidate than sharding b/c my datastore is not huge. However I don't see anything that addresses the issue of speed for servers communicating across large distances with potentially high latency.

That is a very challenging problem you have. Your toleration for loose convergence will make a world of difference. — sysadmin1138, Nov 11 '12 at 03:30
@sysadmin1138 I'm gunna have to go ahead and admit to my own ignorance on this one. I have no idea what loose convergence is nor is google helping much other than providing Active Directory documentation. — wmarbut, Nov 11 '12 at 03:44
If you need 3 different physical locations, but are ok with them being in the same city, you can satisfy this by just using 3 different availability zones within the same EC2 region. Availability zones are isolated physically as well as on different network and electrical feeds. They all might be in the same city, but they're not in the same datacenter. Availability zones have private high-speed fiber connections and you'll see only ~1-2ms latency between them. You can lose an AZ while your others still function, and AWS claims more than one AZ has never failed at the same time. — Jason Floyd, Nov 11 '12 at 05:52
*Convergence Time:* The time it takes a distributed or replicated database to come to a uniform state. "Loose" convergence means that time is fairly long, and the application can tolerate inconsistent state; it functions normally if one replica doesn't have all of the updates from the other replicas just yet, and the replicas eventually will converge if left alone for a while. If you require tight convergence *and* low latency, you'll be spending a lot of money. — sysadmin1138, Nov 11 '12 at 13:10
@sysadmin1138 gotcha, makes sense. We should have a fair tolerance for loose convergence. — wmarbut, Nov 11 '12 at 17:01
@JasonFloyd Unfortunately AZ's won't meet requirements for this. My spec specifies the Virginia, California, and Oregon DC's to start with. — wmarbut, Nov 11 '12 at 17:02

sysadmin1138 · Accepted Answer · 2012-11-12T16:59:45.870

Your latency experiences are somewhat worrying. In my own testing on my local, fast network I've noticed a few differences between Mongo & MySQL when it comes to latency:

MySQL request round-trip-time is frequently under 5ms for small items. Sometimes as low as 2.
MongoDB request RTT is about 3x slower, around 15ms.

Part of the MongoDB time was due to TCP connection setup time, where the MySQL was using a pre-existing (pooled) connection. In both cases the database was not replicated or sharded, so these can be considered best-case (for my network).

Mongo replica-sets can help you here, but only if your application is able to tolerate loose convergence¹. To get maximal speed, you'll have to configure your mongo writes to return as soon as the mongodb server reports that it has received the write, and configure your app-servers to only use the AZ-local Mongo instance. Reads will need to use SlaveOK so they can read from that AZ-local replica; this will show local writes as well as the replicated writes from the other nodes that have converged so far. This means that each AZ will have a slightly different view of the entire database: the past X minutes will just have local changes, but the deep history will be converged.

This setup will provide low (same-AZ) latency between app-server and DB server. However, the data-view from the app-servers will be different depending on the AZ being hit by your app-consumers. Whether or not this architecture is tolerable for your application can only be decided by you.

However, there is a very big problem with this: MongoDB doesn't support multi-master² replication; all writes must go to the single Master.

It is currently (v2.2) not possible to configure MongoDB to allow writes to slaves, so all writes in your "write heavy" application will have to go to the single master of the replica-set. You don't mention if read latency is an issue, but if it is, then SlaveOK reads will grab the local Mongo replica-member. Unlike the setup described above, though, that member may not have received all of the updates from the master yet, so there will definitely be a lag between write-submit and when the write shows up on the local slave.
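To make that lag concrete, here is a toy model in plain Ruby. It does not talk to MongoDB at all; the class names `ToyMaster` and `ToyReplica` are made up for illustration, and the "oplog" is just an array. It only shows why a local read can miss a write that the master has already accepted.

```ruby
# Toy model only: no MongoDB involved. All names here are hypothetical.
class ToyMaster
  attr_reader :oplog

  def initialize
    @oplog = [] # ordered list of accepted writes, like a replication log
  end

  # Accept a write and return its position in the log.
  def write(key, value)
    @oplog << [key, value]
    @oplog.length
  end
end

class ToyReplica
  def initialize(master)
    @master = master
    @applied = 0 # number of log entries already applied locally
    @data = {}
  end

  # Local read: only sees what has been replicated so far.
  def read(key)
    @data[key]
  end

  # Pull pending entries from the master and apply them in order.
  def sync
    pending = @master.oplog.drop(@applied)
    pending.each { |key, value| @data[key] = value }
    @applied += pending.length
  end

  # Number of writes the master has that this replica has not applied.
  def lag
    @master.oplog.length - @applied
  end
end

master  = ToyMaster.new
replica = ToyReplica.new(master)

master.write("order:1", "paid")
stale = replica.read("order:1") # nil, the write has not replicated yet
replica.sync
fresh = replica.read("order:1") # "paid"
```

In a cross-region deployment the gap between the write and the `sync` is where the inter-region latency shows up, so whether the stale read is acceptable is an application decision.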

There are a few different write-types for Mongo. The default returns OK as soon as the Mongo server has fully received the write. The next step up is the mode where it only returns OK when it has committed the write to the Journal. The most paranoid (and therefore slowest by far in a mongo with replicas) only returns OK when a specified number of replicas report the write is in their journals. The default mode is fastest, but the last mode ensures the local replicas have the write (strict consistency).
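As a rough sketch, the three write styles can be expressed with MongoDB's write-concern fields `w` and `j`. The helper name `write_concern_for` and the mode symbols are my own, and treating the default as `w: 1` is an assumption; how the resulting hash is passed to the driver, and how journaling is counted on secondaries, depends on the driver and server version, so check the documentation for yours.

```ruby
# Hypothetical helper: maps the three write styles described above to
# MongoDB write-concern fields. Nothing here calls the driver.
def write_concern_for(mode, replicas: 2)
  case mode
  when :received
    { w: 1 }                 # server acknowledged the write (assumed default)
  when :journaled
    { w: 1, j: true }        # acknowledged once written to the journal
  when :replicated
    { w: replicas, j: true } # acknowledged by `replicas` members
  else
    raise ArgumentError, "unknown write mode: #{mode.inspect}"
  end
end

fast     = write_concern_for(:received)
paranoid = write_concern_for(:replicated, replicas: 3)
```

Note that `w` counts the master as one of the acknowledging members, so with members spread across regions any value above 1 makes every write wait on at least one cross-region round trip.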

If that master is not in the same AZ as the app-servers, the latency may very well be unworkable for you even with the default write style. If this is the case, mongo will not work for you as your app exists right now. You're going to have to think real hard about how to change your application to be less latency-sensitive on writes, or use a non-Mongo database that can do multi-master with loose convergence.

The closest Mongo gets to a multi-master configuration is through sharding. If your app-servers are aware of their geo-location, you can include the geo data in the Mongo Shard-Key. Then, when you connect to the MongoS for writing, all the writes go to the local shard's replica-set. Reads can poll the entire database (and will be correspondingly slow when drawing from the non-local shards), so this will maintain consistency. However, this depends entirely on location being your shard-key.
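A minimal sketch of the application side of that idea, in plain Ruby: every document gets a region field that leads the shard key, so documents written in a region are routed to that region's shard. The field name `region`, the shard names, and the helpers are all hypothetical, and the mapping table only models the routing; in a real cluster the key ranges would be assigned to shards on the cluster itself, not in application code.

```ruby
# Hypothetical mapping from EC2 region to the shard meant to hold its data.
SHARD_FOR_REGION = {
  "us-east-1" => "shard-virginia",
  "us-west-1" => "shard-california",
  "us-west-2" => "shard-oregon"
}.freeze

# Stamp a document with the writing app-server's region so the region
# can be the leading field of the shard key.
def with_region_key(doc, region)
  unless SHARD_FOR_REGION.key?(region)
    raise ArgumentError, "unknown region: #{region.inspect}"
  end
  doc.merge("region" => region)
end

# Model of where a document would land, given the mapping above.
def target_shard(doc)
  SHARD_FOR_REGION.fetch(doc.fetch("region"))
end

doc   = with_region_key({ "user" => "alice", "event" => "login" }, "us-west-2")
shard = target_shard(doc) # "shard-oregon"
```

Queries that include the region field can be answered by one shard; queries without it have to fan out to every region, which is the slow cross-shard read mentioned above.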

1: Loose Convergence, the time for a distributed or replicated database to come to a uniform state is the convergence time. Loose convergence is a long interval. Tight convergence is a short interval.
2: Multi-master, A database where more than one replica can accept writes. Examples of databases that can do this are Active Directory, OpenLDAP, and some MySQL configs.

that is an excellent and thorough answer. Thanks! It looks like we either need to rework the requirements or rework the stack. MySQL master/master replication in a virtualized environment has left a bad taste with the team, so I have a feeling that mongo stays. Cheers! — wmarbut, Nov 11 '12 at 17:09
@wmarbut My instinct is that Mongo is just the taste-untasted in this respect, but it may work. If your testing shows it doesn't, I strongly recommend spending software engineering time creating a multi-master type system. Maybe writes feed into a local DB and publish into a 'master' DB that's then replicated for fast reads, with the app-servers configured to read from either. — sysadmin1138, Nov 12 '12 at 02:28
@wmarbut One option I'd like to mention is how AD handles uniqueness in multi-master. Each 'slave' DB is issued a block of unique IDs from master. Obviously this would require an abstraction layer over Mongo, but it'll help solve one of the trickier problems with multi-master architectures. — sysadmin1138, Nov 12 '12 at 02:32
Thanks for the great advice! Stack exchange is a good place for good info, but this level of depth and expertise is an unusual treat. — wmarbut, Nov 12 '12 at 03:19
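
The ID-block idea from the comments above can be sketched in plain Ruby. This is only an illustration of the allocation scheme, not how Active Directory or any MongoDB layer implements it; `BlockIssuer` and `LocalIdSource` are made-up names, and a real system would persist the issuer's counter and hand out blocks over the network.

```ruby
# Central authority: hands out non-overlapping blocks of IDs.
class BlockIssuer
  def initialize(block_size)
    @block_size = block_size
    @next_start = 1
    @mutex = Mutex.new
  end

  # Returns a half-open range of IDs that no other caller will receive.
  def issue
    @mutex.synchronize do
      block = (@next_start...(@next_start + @block_size))
      @next_start += @block_size
      block
    end
  end
end

# Per-location ID source: only contacts the issuer when its block runs out.
class LocalIdSource
  def initialize(issuer)
    @issuer = issuer
    @current = nil
    @limit = nil
  end

  def next_id
    if @current.nil? || @current >= @limit
      block = @issuer.issue
      @current = block.begin
      @limit = block.end
    end
    id = @current
    @current += 1
    id
  end
end

issuer   = BlockIssuer.new(3)
virginia = LocalIdSource.new(issuer)
oregon   = LocalIdSource.new(issuer)

ids = [virginia.next_id, oregon.next_id,
       virginia.next_id, virginia.next_id, virginia.next_id]
# ids is [1, 4, 2, 3, 7]: unique everywhere, with one issuer call per block
```

The block size trades off how often a location has to make the slow call to the issuer against how many IDs are wasted if a location goes away with a partly used block.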
