
Every tutorial I can find about GlusterFS replicated volumes assumes that both (all) bricks are on the same private network, which then also leads to the conclusion that they must be in the same datacenter.

e.g. "The problem is when the storage you would like to replicate to is on a remote network, possibly in a different location, GlusterFS does not work very well. This is because GlusterFS is not designed to work when there is a high latency between replication nodes." is a quote from https://github.com/GlusterFS/Notes

Also, https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Geo%20Replication/ says that replicated volumes are not meant for geo-replication; however, the real "geo-replication" mechanism in GlusterFS only creates read-only slaves, which won't work in every scenario.

So the question is: why isn't it recommended in general? I haven't found a single example of replicated volumes for hosts on different networks, let alone in different datacenters.

I can also explain why I want to use replicated volumes. I have a vServer (OpenVZ) in a datacenter in Frankfurt, Germany, and a second in Nuremberg, Germany. Both have multiple peerings (DE-CIX, Deutsche Telekom, and so on), and the latency between the vServers is < 4 ms, which in my opinion cannot be considered high latency, whatever the definition of that might be in the case of GlusterFS.

I am running iRedMail services on both servers, and MariaDB is replicated in master-master mode, storing only the mail configuration. The mail storage is on disk, and I am using GlusterFS replicated volumes to replicate it. I see no issues so far (the mail storage is about 20 GB of e-mails including attachments) and am wondering whether I am just lucky or whether there are problems I simply haven't detected yet. In any case, I prefer to follow best practices, which I didn't do here, so I am wondering what you think about GlusterFS replicated volumes for hosts in different datacenters and what "high latency" actually means.

Chris

1 Answer


This issue applies to many types of data stores, not just GlusterFS, because increased distance increases latency. The recommendation to keep nodes on the same subnet also exists to reduce latency by limiting network hops.

In order to maintain data synchronization, the various servers must ensure that they all have the same view of the data. For reads, the latency effect is usually not an issue. However, serious data corruption can occur if multiple servers write the same block before they are synchronized. When a data block is being updated, it is possible to lose changes: if one server reads a block and a different server updates it before the first server writes its change back, the later write overwrites the earlier one and that data is lost.
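The lost-update scenario can be illustrated with a toy simulation. This is not GlusterFS code; plain Python dictionaries stand in for two replicas of one data block:

```python
# Toy illustration of a lost update: two "servers" each read the same
# block, modify their own copy, then write back. Without locking, the
# last write wins and the other change silently disappears.

block = {"flags": set()}  # shared data block, e.g. message flags

# Server A and server B both read the block *before* either writes.
copy_a = {"flags": set(block["flags"])}
copy_b = {"flags": set(block["flags"])}

copy_a["flags"].add("Seen")      # client on server A marks the mail read
copy_b["flags"].add("Flagged")   # client on server B flags the mail

block = copy_a   # A's write replicates first...
block = copy_b   # ...then B's write overwrites it entirely.

print(block["flags"])  # {'Flagged'} -- A's "Seen" flag is lost
```

With a distributed lock, server B would have to re-read the block after A's write, and both flags would survive; the price is a lock round-trip per update.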

Locking mechanisms can be used to reduce the risk of corruption. However, distributed locks take longer to obtain and release as latency increases; in this case, latency is the time to complete a round-trip between servers. Three factors contribute to latency when communicating between data centers:

  • Distance: Data travels over a wire at roughly 30 cm per nanosecond, 300 meters per microsecond, or 300 kilometers per millisecond (in optical fiber the signal propagates at about two thirds of that speed). This adds significant latency as distance increases.
  • Switching time: Each switch a packet passes through needs to examine, route, queue, and transmit the packet. This adds latency that increases as the switch gets busier.
  • Network congestion: Congested networks cause additional delays as traffic is queued longer and possibly re-routed. If congestion is bad enough, the delays may be long enough to trigger packet re-transmission.

Mail data stores tend to be read-mostly. Normally, it is unlikely that multiple clients attached to different servers would be updating the same file or directory. There may be some contention between incoming email messages and clients reading them, but the latency should not be a significant issue there. Maildir-format stores should have relatively lower contention than other formats; however, they have relatively high rename and move activity, which may cause issues if your nodes become disconnected.
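To put rough numbers on the distance factor: the sketch below uses the two-thirds-of-c fiber speed and assumes a ~200 km straight-line Frankfurt–Nuremberg distance (an assumption for illustration; real fiber paths are longer):

```python
# Back-of-the-envelope propagation delay between two datacenters.
# Assumptions (not measurements from the thread): signal speed in fiber
# ~ 2e8 m/s (about two thirds of c), and ~200 km between the sites.

def propagation_rtt_ms(distance_km: float, signal_speed_m_per_s: float = 2e8) -> float:
    """Round-trip propagation delay in ms, ignoring switching and queuing."""
    one_way_s = (distance_km * 1000) / signal_speed_m_per_s
    return 2 * one_way_s * 1000

if __name__ == "__main__":
    print(f"~200 km RTT:  {propagation_rtt_ms(200):.1f} ms")   # 2.0 ms
    print(f"~6000 km RTT: {propagation_rtt_ms(6000):.1f} ms")  # 60.0 ms, transatlantic scale
```

On these assumptions, roughly 2 ms of the questioner's observed < 4 ms would be pure propagation delay, with switching and queuing accounting for the rest; an intercontinental link adds an order of magnitude more before congestion is even considered.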
BillThor
  • The issues "switching time" and "network congestion" may be more pronounced on intercontinental internet connections with latency above 100 ms, but it's not like this can't happen in a large on-premise network either. So, when I say my servers are connected with a latency below 4 ms, are they still prone to data corruption? I am often connected with multiple clients at the same time, such as a smartphone, tablet, or PC. – Chris Aug 14 '17 at 08:27
  • @Chris I would expect it is unlikely you will be modifying the same email at the same time on two or more clients. If you are using Maildir as your store, the most likely collision would be two clients moving mail from the `new` subfolder to the `cur` subfolder. This would be moderated by the IMAP server(s). Even if both IMAP servers were to do the same move at the same time, the change should be identical, so collision recovery should have no issue. Yours appears to be a low-risk use case. – BillThor Aug 15 '17 at 00:30