I'm researching how we can implement near-realtime replication from a primary datacenter to a disaster recovery site. The data to be replicated would be:
- Images of KVM VMs
- MySQL and PostgreSQL databases
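For the sake of simplicity, let's assume it's less than 10 TB of data in total, with an average write speed of under 100 MB/s, peaking at 1500 MB/s, and that the link between the primary and backup datacenters has a throughput of 10 Gbit/s. Note that the peak write rate exceeds what the link can carry (10 Gbit/s is roughly 1250 MB/s before protocol overhead).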
Asynchronous replication is acceptable and desired: during bursty writes or a short connectivity outage between the two datacenters, we don't want to slow down local writes. We are willing to lose the most recent portion of data in case of a catastrophic failure affecting the primary datacenter.
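To put rough numbers on how much the replication layer would have to buffer, here is a back-of-the-envelope sketch. The link, peak, and average rates are the figures above; the 60-second burst length and the 5-minute outage are assumptions I picked purely for illustration, and protocol overhead is ignored.

```python
def backlog_mb(write_mb_s: float, link_mb_s: float, seconds: float) -> float:
    """MB that queue up at the source while writes outpace the link."""
    return max(0.0, (write_mb_s - link_mb_s) * seconds)


def drain_seconds(backlog: float, link_mb_s: float, steady_write_mb_s: float) -> float:
    """Seconds needed to ship a backlog while steady-state writes continue."""
    spare = link_mb_s - steady_write_mb_s
    if spare <= 0:
        raise ValueError("link has no spare capacity; backlog never drains")
    return backlog / spare


LINK_MB_S = 10_000 / 8   # 10 Gbit/s is about 1250 MB/s, overhead ignored
PEAK_MB_S = 1500         # peak write rate from the question
AVG_MB_S = 100           # average write rate from the question
BURST_SECONDS = 60       # ASSUMPTION: illustrative burst length
OUTAGE_SECONDS = 300     # ASSUMPTION: illustrative link outage

# Backlog built up during a burst at peak rate with the link fully used.
burst_backlog = backlog_mb(PEAK_MB_S, LINK_MB_S, BURST_SECONDS)

# Backlog built up during an outage (effective link rate 0) at average rate.
outage_backlog = backlog_mb(AVG_MB_S, 0, OUTAGE_SECONDS)

print(f"burst backlog:  {burst_backlog:.0f} MB")
print(f"outage backlog: {outage_backlog:.0f} MB")
print(f"burst drain time at average load: "
      f"{drain_seconds(burst_backlog, LINK_MB_S, AVG_MB_S):.1f} s")
```

Under these assumptions the buffer (DRBD Proxy memory, a MARS transaction log, or similar) would need to hold on the order of tens of gigabytes, which is also the amount of recent data we would be accepting to lose in the worst case.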
My understanding is that we can choose between:
- Proprietary SAN hardware that comes with a replication feature and can provide iSCSI LUNs
- DRBD, which will likely need to be combined with DRBD Proxy to make sure that a temporary drop in available bandwidth or a latency spike between the two datacenters does not affect write performance at the source
- Software-based solutions like MARS (http://schoebel.github.io/mars/), which, sadly, will take quite a while to be merged into the mainline kernel even in the best-case scenario
- For the databases, database-level replication is an option as well, but we'd like to carry out occasional DR tests in which we switch all workloads between the datacenters, and failing back from the DR site to the main site would be quite cumbersome with that approach.
Are there any other solutions worth considering?
Thank you!