Your latency experiences are somewhat worrying. In my own testing on my local, fast network I've noticed a few differences between Mongo & MySQL when it comes to latency:
- MySQL request round-trip-time is frequently under 5ms for small items. Sometimes as low as 2.
- MongoDB request RTT is about 3x slower, around 15ms.
Part of the MongoDB time was due to TCP connection setup, whereas the MySQL test was using a pre-existing (pooled) connection. In both cases the database was not replicated or sharded, so these can be considered best-case numbers (for my network).
Mongo replica-sets can help you here, but only if your application is able to tolerate loose convergence [1]. To get maximum speed, you'll have to configure your Mongo writes to return as soon as the MongoDB server reports that it has received the write, and configure your app-servers to use only the AZ-local Mongo instance. Reads will need to use SlaveOK so they can read from that AZ-local replica; this will show local writes as well as the replicated writes from the other nodes that have converged so far. This means that each AZ will have a slightly different view of the entire database: the past X minutes will only have local changes, but the deep history will be converged.
This setup will provide low (same-AZ) latency between app-server and DB server. However, the data-view from the app-servers will be different depending on the AZ being hit by your app-consumers. Whether or not this architecture is tolerable for your application can only be decided by you.
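To make the "different view per AZ" point concrete, here is a toy in-memory model of loose convergence. It is only a sketch: `AZReplica` and `converge` are hypothetical names invented for illustration, not any MongoDB API, and conflicting writes to the same key are ignored entirely.

```python
from dataclasses import dataclass, field


@dataclass
class AZReplica:
    """Hypothetical stand-in for one AZ-local database node."""
    name: str
    data: dict = field(default_factory=dict)    # what local reads can see
    outbox: list = field(default_factory=list)  # local writes not yet shipped

    def write(self, key, value):
        # Acknowledged as soon as the local node has it; replication is deferred.
        self.data[key] = value
        self.outbox.append((key, value))

    def read(self, key):
        return self.data.get(key)


def converge(replicas):
    """Ship every pending local write to all other replicas (no conflict handling)."""
    for source in replicas:
        for key, value in source.outbox:
            for target in replicas:
                if target is not source:
                    target.data[key] = value
        source.outbox.clear()


if __name__ == "__main__":
    az_a = AZReplica("az-a")
    az_b = AZReplica("az-b")
    az_a.write("order:1", "paid")
    az_b.write("order:2", "pending")
    print(az_a.read("order:2"))  # None: az-a has not seen az-b's write yet
    converge([az_a, az_b])
    print(az_a.read("order:2"))  # pending: the views have converged
```

Until `converge` runs, an app-consumer hitting `az-a` and one hitting `az-b` see different data; the longer that interval, the looser the convergence.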
However, there is a very big problem with this: MongoDB doesn't support multi-master [2] replication; all writes must go to the single master.
It is currently (v2.2) not possible to configure MongoDB to allow writes to slaves, so all writes in your "write heavy" application will have to go to the single master of the replica-set. You don't mention if read latency is an issue, but if it is, SlaveOK reads will hit the local Mongo replica-member. Unlike the scenario above, though, that member may not have received all of the updates from the master yet, so there will definitely be a lag between write-submit and when the write shows up on the local slave.
There are a few different write-types for Mongo. The default returns OK as soon as the Mongo server has fully received the write. The next step up is the mode where it only returns OK when it has committed the write to the journal. The most paranoid (and therefore slowest by far in a Mongo with replicas) only returns OK when a specified number of replicas report the write is in their journals. The default mode is fastest, but the last mode ensures the local replicas have the write (strict consistency).
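A rough way to reason about what each write-type costs is a back-of-the-envelope latency model. The sketch below is my own simplification, not how MongoDB measures or reports anything: the function name, the mode labels, and all the millisecond figures are hypothetical, and it assumes the master journals first and then waits on the fastest N replicas.

```python
def write_ack_latency_ms(mode, app_to_master_rtt_ms, journal_commit_ms=0.0,
                         replica_rtts_ms=(), replicas_required=0):
    """Estimate time until the app sees OK for one write (toy model)."""
    if mode == "received":
        # OK as soon as the master has the write in memory.
        return app_to_master_rtt_ms
    if mode == "journaled":
        # OK only after the master has committed to its journal.
        return app_to_master_rtt_ms + journal_commit_ms
    if mode == "replicated":
        if replicas_required > len(replica_rtts_ms):
            raise ValueError("not enough replicas to satisfy the write")
        # Wait for the N fastest replicas; the slowest of those N decides.
        fastest = sorted(replica_rtts_ms)[:replicas_required]
        replica_wait = max((rtt + journal_commit_ms for rtt in fastest),
                           default=0.0)
        return app_to_master_rtt_ms + journal_commit_ms + replica_wait
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    # Assumed numbers: 2 ms to the master, 30 ms journal commit,
    # one same-AZ replica (1 ms away) and one remote replica (80 ms away).
    rtts = (1.0, 80.0)
    print(write_ack_latency_ms("received", 2.0))                 # 2.0
    print(write_ack_latency_ms("journaled", 2.0, 30.0))          # 32.0
    print(write_ack_latency_ms("replicated", 2.0, 30.0, rtts, 1))  # 63.0
    print(write_ack_latency_ms("replicated", 2.0, 30.0, rtts, 2))  # 142.0
```

With these made-up figures, requiring the remote replica more than doubles the wait compared with requiring only the nearby one, which is why the replicated mode is the slowest by far.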
If that master is not in the same AZ as the app-servers, the latency may very well be unworkable for you even with the default write style. If this is the case, Mongo will not work for you as your app exists right now. You're going to have to think hard about how to change your application to be less latency-sensitive on writes, or use a non-Mongo database that can do multi-master with loose convergence.
The closest Mongo gets to a multi-master configuration is through sharding. If your app-servers are aware of their geo-location, you can include the geo data in the Mongo shard-key. Then, when you connect to the mongos for writing, all the writes go to the local shard's replica-set. Reads can poll the entire database (and will be correspondingly slow when drawing from the non-local shards), so this will maintain consistency. However, this depends entirely on location being part of your shard-key.
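The routing idea can be sketched as a toy lookup. This is not how mongos is implemented (real routing works on chunk ranges of the shard-key); the region names, shard names, and helper functions below are all hypothetical and only illustrate why a location-prefixed key keeps writes local while unscoped reads fan out.

```python
# Hypothetical mapping from the location part of the shard-key to a shard.
SHARD_FOR_REGION = {
    "us-east": "shard-us-east",
    "eu-west": "shard-eu-west",
}


def make_shard_key(region, doc_id):
    """Compound key with the location first, so it decides placement."""
    return (region, doc_id)


def route_write(shard_key):
    """A write touches exactly one shard: the one owning the key's region."""
    region, _doc_id = shard_key
    return SHARD_FOR_REGION[region]


def route_read(region=None):
    """Shards a query must contact; without a region it has to ask them all."""
    if region is not None:
        return [SHARD_FOR_REGION[region]]
    return sorted(set(SHARD_FOR_REGION.values()))


if __name__ == "__main__":
    key = make_shard_key("us-east", 1001)
    print(route_write(key))       # shard-us-east
    print(route_read("us-east"))  # ['shard-us-east']
    print(route_read())           # ['shard-eu-west', 'shard-us-east']
```

An app-server in `us-east` that always writes with its own region in the key only ever waits on its local shard; a read that does not constrain the region pays for the slowest remote shard.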
1: Loose convergence: the time it takes for a distributed or replicated database to come to a uniform state is the convergence time. Loose convergence is a long interval; tight convergence is a short interval.
2: Multi-master: a database where more than one replica can accept writes. Examples of databases that can do this are Active Directory, OpenLDAP, and some MySQL configs.