
Right now I am using DRBD to replicate two directories (/var/www and /var/spool/mail) between two Xen VPSes that are 7,000 miles away from each other! On top of that, I am using a transparent IPsec tunnel VPN to connect both nodes at the private level (doesn't seem fair, I know). Right now I am placing the www and mail folders on the DRBD partition and just soft-linking them into place on each machine. It's working and replicating, but since I have so much load at the network level (distance and security), my disk read/write speed is horrible: I open a web page in 6 minutes or more, and I have mail delays. At the end of the day I face a dual split-brain and both nodes are restarted; DRBD then brings both nodes up as Secondary, the mount never happens, Apache is left with no active document root to start with, and at that exact point redundancy kills availability!

I am trying to take some load off the DRBD partition to speed things up a little, so I copied both directories back to their original locations and made a soft-link to each of them on the DRBD partition, but that never worked. Right now I need good suggestions! (I am using OCFS2 for the DRBD partition, BTW.)

user204252

2 Answers


What about "do not replicate over a 7,000-mile, slow, high-latency link" to start with?

Replication at the DRBD level has its place, but you are basically abusing it: it was designed for low-latency, high-bandwidth scenarios ONLY. You can also use it asynchronously in a disaster recovery scenario where it's acceptable to lose some data as the replication falls behind and catches up.

If you're not in one of those two scenarios, just forget the idea of using something like DRBD. Organize local data in each data center and use replication and backups to pull things down sensibly.
For example, it makes little sense to replicate a mail spool. Web stuff (sites etc.)? No sense either, since you can use other tools to distribute that data.
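
For example, distributing static web content can be as simple as a one-way rsync pull (a minimal sketch; the hostname and paths are illustrative assumptions, not taken from the setup above):

    # Mirror the document root from the primary; -a preserves permissions
    # and timestamps, -z compresses over the slow WAN link, and --delete
    # keeps the copy an exact mirror.
    rsync -az --delete primary.example.com:/var/www/ /var/www/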


If you take a special-purpose technology, ignore its limitations, and put it into a situation it was not made for, you get exactly the disaster you describe here.

DRBD is a high-availability function for local machines. It allows one to replicate a file system in case a machine fails. It is not designed to handle WAN scenarios unless you use async mode, and that is "write out" only (i.e. making a copy to an offsite location). And even then you still must have the bandwidth to handle it, and that can be taxing (as in: 1 Gbit+).
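
If you do go the async disaster-recovery route, the mode is selected per resource in the DRBD configuration. A minimal sketch, assuming DRBD 8.3-style syntax; the resource name, host names, devices, and addresses are all made up:

    # Create an illustrative async resource definition (protocol A means
    # local writes complete without waiting for the remote peer).
    cat > /etc/drbd.d/r0.res <<'EOF'
    resource r0 {
      protocol A;
      on alpha {
        device    /dev/drbd0;
        disk      /dev/xvdb1;
        address   10.0.0.1:7788;
        meta-disk internal;
      }
      on bravo {
        device    /dev/drbd0;
        disk      /dev/xvdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
      }
    }
    EOF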

TomTom
  • I totally hear you and understand every word you're saying, but the thing is that both servers are behind round-robin DNS because I need to balance the requests. Those are very busy servers and they need to be updated instantly; backups are not useful in my situation! – user204252 Feb 13 '14 at 13:38
  • Actually they are. See, if you structure the website correctly, most stuff does not need replication but can be moved "on demand" (2 copies of the files and config). Database replication is better done at the database level with specialized software. Using DRBD for everything is not what it is used for in this area, and it makes the problem worse. For example, a DB update will be a LOT more traffic at the disk level than through a specialized DB-internal replication mechanism. If you need a distributed application, plan and build one. – TomTom Feb 13 '14 at 13:45
  • I am already handling DB replication through MySQL dual-primary mode; I am not using DRBD for the DB because at some point it can get messy. My only problem is the web root folder and the mail folder, because once the round-robin switches you to the other node, you shouldn't notice a difference. Is there any way to link directories to each other without stressing things more? drbdlinks, Unison, etc.; what do you suggest? – user204252 Feb 13 '14 at 13:47
  • Well, mail is simple: do not use any replication. Both sides get their local mail server. Finished. Web is the same: local storage, replicate changes manually. Specific tips? Not me, no real experience with DRBD (only similar technologies at the SAN level). I also only do Windows, so for the deep internals of Linux I'm the wrong person. – TomTom Feb 13 '14 at 13:54
  • Why is the web root a problem? You store dynamic updates from users there? Then you can have the servers replicate manually ;) Or use rsync or something similar. Plan for time delays... links may go down. – TomTom Feb 13 '14 at 13:54
  • No, mail is not simple. Imagine having a mail on one server while you're on the other server: you have to wait until you're switched over to get your mail. And rsync is one-way replication; it will never serve my scenario, since that way I will have to baby-sit the 2 nodes! – user204252 Feb 13 '14 at 13:57
  • Actually no. See, mail is simple for sending because mail is a protocol with redundancy built in. And mail is simple for reading because there are solutions doing that already. Heck, you could most likely fake it with automated forwarding rules between 2 servers (outside read-state management). – TomTom Feb 13 '14 at 15:19
  • Sending is fine, we agree on that point, but receiving shouldn't be complicated because I already have a complex setup; one thing goes wrong and the whole thing will blow up. I tried changing to protocol B in DRBD to see if it makes any difference. I just need a slight improvement, not light speed. – user204252 Feb 13 '14 at 15:57
  • The problem is that DRBD is a disk sync. If it is async, then you blow consistency. This is not solvable: async disk replication is only a disaster recovery measure. Always. Because you have no sync points. – TomTom Feb 13 '14 at 16:44
  • It didn't even work, since it's a dual-primary setup; protocol C is required. – user204252 Feb 13 '14 at 20:04
  • 1. Mail: 1 server, all the mail is there. Rsync maildirs to the other server every 15 minutes; 7,000 miles isn't far to download email, and POP and IMAP don't require instantaneous replies. 2. Web root: keep dynamic content in the DB; when static things are updated, ensure that happens on only one server and do periodic rsyncs (even disable FTP on the secondary). 3. Drop DRBD, you're doing it wrong. – NickW Feb 19 '14 at 16:52

As TomTom and ThatGraemeGuy have pointed out, your design assumption (that you can achieve what you want with DRBD) is flawed. DRBD is useful for synchronizing block devices (so says the name: Distributed Replicated Block Device). You could theoretically use it in your current scenario as TomTom describes (async mode, for things where it's OK to lose some data), but you're not describing anything where that situation exists.

It also seems that you're making this far more complicated than it needs to be: it sounds like you just need a simple "Primary/Secondary" environment.

For Web Stuff
Web sites are changed periodically. It's easy enough to make a backup of the web site that can be restored to a remote server (or store everything in a version control system, or use a configuration management system to "push" the website to multiple servers).
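
A minimal sketch of the version-control approach (the repository URL and paths are hypothetical):

    # One-time setup on each web server: the document root is a clone of the site repo.
    git clone https://git.example.com/site.git /var/www
    # Deploying an update is then just a fast-forward pull on every server.
    cd /var/www && git pull --ff-only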

For Databases
Web sites are often database-backed these days (and the database is usually the only part that changes "continuously"), but every database engine worth using has some kind of replication capability (you say you're already using this).
Configured properly, DB replication is way better than DRBD for replicating databases, because the remote DB engine guarantees the same level of ACID that the master server has.
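
For example, with MySQL replication you can check at any time that the remote copy is healthy and current (a sketch, assuming a MySQL 5.x replica is already configured):

    # Both replication threads should report "Yes" and
    # Seconds_Behind_Master should stay near zero.
    mysql -e 'SHOW SLAVE STATUS\G' \
        | grep -E 'Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master'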

For Email
Like TomTom said in his answer, there's no point in replicating your outgoing mail spool.
If you lose the master server with one or two emails in the queue, your users can re-send them; it's a corner case anyway, because unless the recipient's server is down, the email is off your system in a few seconds. Not worth worrying about.
People's mailboxes are another story: here you're going to want backups (or a mail system that supports replication). This may mean there's an hour or a day when people don't have access to their old email after you fail over to the secondary server (while you restore the old messages), but that's usually OK because they're getting their current email. If the restore time is not acceptable, you can continuously restore the backups to the secondary server (or use something like rsync to keep the mailboxes in sync every few hours).
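
A sketch of the rsync variant for mailboxes (the host, paths, and schedule are assumptions; remember rsync is one-way, so mail must only be written on the primary):

    # Install a cron job that mirrors the mail spool to the standby
    # every 15 minutes.
    cat > /etc/cron.d/mailsync <<'EOF'
    */15 * * * * root rsync -az --delete /var/spool/mail/ standby.example.com:/var/spool/mail/
    EOF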


There are a couple of edge cases in what I described above that you should be aware of.

One: if your servers are "very busy" you may need to load balance properly (using something like HAProxy to distribute web requests between "front-end" servers, and moving mail and the DB onto their own servers). That's how you scale out properly.
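
A minimal sketch of that kind of front-end split, assuming HAProxy with made-up backend addresses:

    # Append a round-robin frontend/backend pair to the HAProxy config.
    cat >> /etc/haproxy/haproxy.cfg <<'EOF'
    frontend www
        bind *:80
        default_backend webservers
    backend webservers
        balance roundrobin
        server web1 10.0.0.11:80 check
        server web2 10.0.0.12:80 check
    EOF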

Techy computer-sciency explanation: with the DRBD hackery the bandwidth requirements are close to O(N^2), where N = the number of nodes; the solution I've outlined is roughly O(N), where N = the number of DR sites (and the number of DR sites is not likely to exceed 2).

Two: if your web servers write data to the local file system, you will need to re-architect that part. Store the files in the database, in a NoSQL store like MongoDB, or on a central storage server (over NFS or something similar, possibly replicating THAT with async DRBD to your off-site location for near-real-time disaster recovery). Basically, you need some solution that ensures the local file writes get made available to all the other "front-end" servers.
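
A sketch of the central-storage variant over NFS (the hosts, network, and paths are hypothetical):

    # On the storage server: export the directory the web tier writes to.
    cat >> /etc/exports <<'EOF'
    /srv/uploads 10.0.0.0/24(rw,sync,no_subtree_check)
    EOF
    exportfs -ra

    # On each web server: mount it where the application expects to write.
    mount -t nfs storage.example.com:/srv/uploads /var/www/uploads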

voretaq7
  • Note that 7,000 miles isn't far to travel at light speed (the light-speed RTT is 0.07 seconds, i.e. 70 ms). If we assume internet routing and congestion triples that time, we're talking 210 ms (which is about right, since pinging California from New York takes about 90-100 ms). Your farthest users can deal with a 0.2-second lag; it's probably less painful than the bandwidth saturation or sync-delay lockups you'll face with DRBD! – voretaq7 Feb 19 '14 at 17:07
  • 70 ms is TERRIBLY high in this case though, because we are talking about making BOTH hard disks 70 ms higher in latency. Not sure which plane you live on, but on my computers a 70 ms disk latency shows up brutally. I keep relevant core latency (outside storage pools) in the low single digits. – TomTom Feb 26 '14 at 15:33
  • @TomTom I believe you failed to read my answer, or grossly misinterpreted it (I explicitly say that using DRBD - or really ANY synchronous replication technology - is a terrible idea here, because of precisely the issue you're talking about). I am *exclusively* referring to the delay users will experience by not having a "local" facility hosting the site they're accessing (e.g. someone in China having to talk to New York to access the site) or to forward "writes" back to a location 7000 miles away. 70ms RTT is *barely* perceptible to a human in such cases. – voretaq7 Feb 26 '14 at 17:28
  • Yes, that is correct ;) Just saying that these same 70 ms on something like remote DRBD will make a server crawl. Terribly, sadly. – TomTom Feb 26 '14 at 17:39
  • @TomTom Yeah, that's why you don't use synchronous block-level replication unless the machines are connected port-to-port by 10GbE :) – voretaq7 Feb 26 '14 at 17:55