This is a difficult question to answer completely; it really depends on your goals and expectations.
Here are four potential solutions, which get progressively more complicated and require more ongoing attention:
The quick & dirty solution is to simply run rsync from a cron job (every minute?) and replicate the files to and from the servers. This is a bit dangerous: if a file is modified on both servers between runs, you can end up with a mess.
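A minimal sketch of the cron approach, assuming key-based SSH auth between the hosts; the path /srv/shared and the hostname otherserver are placeholders:

```bash
# crontab entry: one-way push to the peer every minute
# (running the mirror-image job on the other server gives you the
#  two-way sync described above, which is where the conflict risk comes in)
* * * * * rsync -az /srv/shared/ otherserver:/srv/shared/
```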
More complicated: you can reduce the window in which both servers might modify the same files by setting up a bash script that uses inotifywait to watch for file/directory changes and runs rsync as soon as a change happens, as in the sketch below.
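This is a rough sketch, assuming inotify-tools is installed and using the same placeholder path and hostname as above:

```bash
#!/usr/bin/env bash
# Watch the directory tree and push changes the moment they happen,
# instead of waiting for the next cron run.
WATCH_DIR=/srv/shared
while inotifywait -r -e modify,create,delete,move "$WATCH_DIR"; do
    rsync -az "$WATCH_DIR"/ otherserver:/srv/shared/
done
```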
Still more complicated, you can work with file-system frameworks such as glusterfs, which replicate files across servers in large-scale environments. This option is not for the weekend warriors out there: slight misconfigurations can make it fail spectacularly (which I've experienced) and cause no end of headaches. When it is properly configured and set up, it can work amazingly well.
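To give a feel for the setup, here is a sketch assuming two nodes named server1 and server2 with glusterfs-server installed and a brick filesystem mounted at /data/brick1 on each (all names are placeholders):

```bash
gluster peer probe server2                      # run once from server1
gluster volume create shared replica 2 \
    server1:/data/brick1/shared server2:/data/brick1/shared
gluster volume start shared

# Clients (or the servers themselves) mount the replicated volume:
mount -t glusterfs server1:/shared /mnt/shared
```

Note that recent versions will warn about split-brain risk when you create a two-brick replica; adding an arbiter or a third replica helps with that.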
Finally, you can go the extreme route and set up block-level replication between the servers using something like drbd. In my own data centers, I have set up replicated block storage devices with NFS on top in HA virtual environments, with great success. A storage array can die at any point and the virtual machines running on it never see more than a few milliseconds of delay.
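To give an idea of the moving parts, a bare-bones DRBD sketch; the resource name, hostnames, backing disk, and addresses are all placeholders:

```bash
# Define the resource on both nodes (same file on each).
cat > /etc/drbd.d/r0.res <<'EOF'
resource r0 {
    device    /dev/drbd0;
    disk      /dev/sdb1;        # backing device, must exist on both nodes
    meta-disk internal;
    on node1 { address 10.0.0.1:7789; }
    on node2 { address 10.0.0.2:7789; }
}
EOF

drbdadm create-md r0        # initialise metadata (run on both nodes)
drbdadm up r0               # bring the resource up (both nodes)
drbdadm primary --force r0  # promote one node and start the initial sync
```

From there you put a filesystem (or an NFS export, as above) on /dev/drbd0 and let your cluster manager handle failover.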