I appreciate that I'm asking multiple questions on the same topic, but they are all related to the same purpose.

I am working on a horizontally scaled cluster setup and trying to set up Unison to sync "/var/www/html" for HA.

Syncing between 2 servers is easy and works like a charm; however, there will be 10+ servers connected via VLAN.

After a lot of searching, I can see that most people, and even the Unison docs, recommend a "star topology" setup:

[Image: star topology diagram]

However, either I have misunderstood the setup or my worries are justified (you tell me).

Star Topology:

In "star topology" setup, a "hub" server pushes changes to the rest of the servers.

For example we have servers : A (hub),B,C,D,E,F. If we add/change something on server A, it will sync it with the servers B,C,D,E,F.

However since I will host websites in "/var/www/html", what happens in the scenario where:

  • A load balancer is used in front of all servers
  • A wordpress website is hosted across servers
  • An author adds a blog post with images, but does so on, say, server D, since the load balancer can "land" them on any of the servers

I would like an explanation of how this case is handled. Do you need to push from each server to A?

An example setup script would be much appreciated.

Fully Connected Topology:

  • Can this be achieved with Unison?
  • Is it better and more reliable than the star topology?
  • What would the setup script look like on each server?

Many thanks to anyone that will give feedback!

Mecanik

1 Answer

Since you're aiming to sync a website between these servers, you need the servers to stay in sync all the time, with little or no delay between a file changing on one server and Unison propagating that change to another. This can be done fairly easily with the Unison option repeat=watch, possibly combined with inotify-tools.
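As a rough sketch of the repeat=watch idea, the snippet below renders the text of a Unison profile that one spoke could use to stay in sync with the hub. The host name `hub-a`, the function name, and the choice of `batch` and `auto` are assumptions for illustration and do not come from the question; watch mode also typically needs the `unison-fsmonitor` helper to be installed.

```python
def render_spoke_profile(hub_host: str, path: str = "/var/www/html") -> str:
    """Return the text of a Unison profile syncing a local path with the hub.

    hub_host is a hypothetical host name; adjust it to your own hub server.
    """
    lines = [
        # First root: the local copy of the web root on this spoke.
        f"root = {path}",
        # Second root: the hub over SSH. An absolute remote path needs a
        # double slash after the host, which "/" + "/var/..." produces.
        f"root = ssh://{hub_host}/{path}",
        # Run without prompting and accept non-conflicting changes.
        "batch = true",
        "auto = true",
        # Keep running and re-sync whenever the file watcher reports a change.
        "repeat = watch",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_spoke_profile("hub-a"), end="")
```

Saved as, for example, `~/.unison/spoke.prf`, a profile like this would be started with `unison spoke`.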

TL;DR: The star topology avoids headaches that come up with the fully connected topology.

The star topology can sync changes almost instantly, but it requires Unison to run more than once per change. Suppose the load balancer lands a user on server D and the user uploads an image. If Unison is running on D with the repeat=watch option, essentially as a daemon watching your files for changes, it will start syncing to the hub node A as soon as the image is uploaded. You then need to trigger Unison between A and the other spoke servers. Ideally you would split this work among the spoke nodes rather than running many instances of Unison on A to push to the spokes. So I'd use inotify-tools on A to watch for changes and, whenever a change occurs, have A send a command to each spoke telling it to run Unison and fetch the changes from A.
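The hub-side part of this could look roughly like the sketch below: wait for a change with `inotifywait`, then ask every spoke over SSH to run Unison against the hub. The host names, the event list, and all function names are hypothetical, and the sketch assumes password-less SSH from the hub to the spokes and from the spokes back to the hub.

```python
import shlex
import subprocess

WEB_ROOT = "/var/www/html"
HUB = "server-a"                                  # hypothetical hub host name
SPOKES = ["server-b", "server-c", "server-d"]     # hypothetical spoke host names


def watch_command(path=WEB_ROOT):
    """inotifywait call that blocks until one change happens under path."""
    return ["inotifywait", "-r", "-q",
            "-e", "close_write", "-e", "create",
            "-e", "delete", "-e", "move", path]


def spoke_sync_command(spoke, hub=HUB, path=WEB_ROOT):
    """SSH command asking one spoke to run Unison between itself and the hub."""
    local_root = shlex.quote(path)
    hub_root = shlex.quote(f"ssh://{hub}/{path}")  # double slash: absolute path
    return ["ssh", spoke, f"unison -batch -auto {local_root} {hub_root}"]


def fan_out(hub=HUB, spokes=SPOKES, launch=subprocess.Popen):
    """Start the sync on every spoke in parallel, then wait for all of them."""
    procs = [launch(spoke_sync_command(spoke, hub)) for spoke in spokes]
    for proc in procs:
        proc.wait()


def watch_and_fan_out():
    """Loop forever on the hub: wait for a change, then trigger the spokes."""
    while True:
        subprocess.run(watch_command(), check=False)
        fan_out()
```

Calling `watch_and_fan_out()` on the hub starts the loop. Each spoke does its own transfer, so the hub only sends the trigger, which matches the idea of splitting the work among the spokes.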

In contrast, a fully connected setup comes with a complication, especially if you just use repeat=watch to sync things instantly. Suppose a user uploads a file to server D. In a fully connected setup, Unison would run one pair at a time, once for each other server, to sync that file. So first D syncs to A, then D starts to sync to B; but because A has changed and is now out of sync with B, A will also run Unison and try to sync to B. Now B is being updated from two sources at once, which might make Unison cranky at the least. On top of this first headache, goodness forbid you get conflicting changes on two servers: say a user uploads a file to D and, before everything syncs, another user uploads a file of the same name to E.
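To put a rough number on the difference, the snippet below counts how many server pairs Unison has to keep in sync under each topology. It is plain arithmetic, not a measurement of Unison itself, and the function names are made up for this illustration.

```python
def star_links(n: int) -> int:
    """Pairs to sync in a star: each of the n - 1 spokes pairs with the hub."""
    return n - 1


def full_mesh_links(n: int) -> int:
    """Pairs to sync when every server pairs with every other server."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    for n in (2, 6, 10):
        print(f"{n} servers: star={star_links(n)}, full mesh={full_mesh_links(n)}")
    # 2 servers: star=1, full mesh=1
    # 6 servers: star=5, full mesh=15
    # 10 servers: star=9, full mesh=45
```

With the 10 servers mentioned in the question, that is 9 pairs to manage in a star versus 45 in a fully connected setup, and every extra pair is another place where two syncs can collide.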

Mike Pierce
  • Thank you very much for the clarification, I will go ahead with the "star topology". However, I do need a bit more information on the setup. If I understood correctly, B, C, D, E, F will ONLY sync to A (one way?), and A will sync to B, C, D, E, F (one way or bidirectional?). Is this correct? Could you share a pseudo/boilerplate script for Unison (a .prf, perhaps)? – Mecanik Apr 09 '19 at 09:08
  • Just note that the star topology creates a single point of failure (host A in these examples). A clustered filesystem may be easier to set up. – ndemou Aug 23 '20 at 08:32