
I'm trying to enforce split-brain protection when machine B takes over in a failover scenario. Basically, I want to guarantee that machine A is really down before machine B is activated, avoiding the infamous split-brain scenario.

So I need a software or hardware solution that lets me remotely and reliably kill machine A by cutting its power. That's the STONITH approach, or Shoot The Other Node In The Head.

How can that be done?

Dickinson
    Via [corosync and pacemaker](http://clusterlabs.org), and the corresponding resource agents. – gxx Oct 14 '17 at 16:13

2 Answers

Switching off a server's power in such a situation is normally done via IPMI or a switchable power supply unit with network access.
Since a split-brain situation implies that something odd has happened, quite possibly a network outage, you normally hook this management network up to a separate switch so that fencing still works when the cluster network does not.
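
As a rough illustration of the IPMI route, here is a minimal Python sketch that wraps the `ipmitool` command line. It assumes `ipmitool` is installed on the surviving node and that the BMC of the other node is reachable over the separate management network. The function names, the address and the user name are made up for the example. In a real cluster you would let the cluster stack's fence agent do this rather than a hand-rolled script.

```python
import os
import subprocess


def ipmi_power_command(bmc_host, user, action="off"):
    """Build the ipmitool argv that sends a chassis power action to a BMC."""
    if action not in {"off", "on", "cycle", "reset", "status"}:
        raise ValueError(f"unsupported power action: {action}")
    return [
        "ipmitool",
        "-I", "lanplus",   # IPMI v2.0 over LAN
        "-H", bmc_host,    # BMC address on the dedicated management network
        "-U", user,
        "-E",              # take the password from the IPMI_PASSWORD env var
        "chassis", "power", action,
    ]


def fence_node(bmc_host, user, password, timeout=20):
    """Power the node off; return True only if ipmitool exits successfully.

    Raises subprocess.TimeoutExpired if the BMC does not answer in time.
    """
    env = dict(os.environ, IPMI_PASSWORD=password)
    result = subprocess.run(
        ipmi_power_command(bmc_host, user, "off"),
        env=env, capture_output=True, text=True, timeout=timeout,
    )
    return result.returncode == 0
```

Machine B would call something like `fence_node("10.0.0.2", "admin", secret)` and start its resources only if that returns `True`. If the call fails or times out, the safe choice is not to take over.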

Second, you configure corosync/pacemaker, as already outlined by gf_, to switch off the other node. In a two-node cluster you have the problem of choosing which node survives, which is why you normally run an odd number of nodes. There are ways to overcome this, but which one fits depends on your needs and expectations.
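
To see why an odd node count helps, consider plain majority voting. The snippet below is only a worked example of the arithmetic, assuming one vote per node and a strict-majority rule. It is not how corosync is configured, and the function names are invented for illustration.

```python
def quorum(total_votes):
    """Smallest number of votes that forms a strict majority."""
    return total_votes // 2 + 1


def has_quorum(partition_votes, total_votes):
    """True if a partition holding partition_votes may keep running."""
    return partition_votes >= quorum(total_votes)


# Each entry: cluster size and the sizes of the two halves after a network split.
for total, split in [(2, (1, 1)), (3, (2, 1)), (4, (2, 2)), (5, (3, 2))]:
    verdict = [has_quorum(side, total) for side in split]
    print(f"{total} nodes split {split}: quorate sides = {verdict}")
```

With 2 or 4 nodes an even split leaves no side with a majority, so nobody can decide who gets to shoot whom. With 3 or 5 nodes exactly one side always wins.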

Thomas

Besides IPMI, you can also use the APIs built into virtualization platforms such as KVM or VMware. The idea is to immediately turn off the VM (if the cluster is based on virtual machines, of course). I believe it can also be done for GCE/AWS; however, it would require some scripting on the admin side (writing your own STONITH agent).

https://www.hastexo.com/resources/hints-and-kinks/fencing-libvirtkvm-virtualized-cluster-nodes/
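
To give an idea of what "writing your own STONITH agent" involves, here is a hedged Python sketch for the libvirt/KVM case. It assumes the agent receives its options as `name=value` lines on stdin, that the option `port` carries the name of the VM to fence, and that `virsh` is available and allowed to talk to the hypervisor. The `uri` option and all function names are hypothetical. A real agent also has to implement metadata and monitor actions and the expected exit codes, so treat this as an outline only. Ready-made agents such as `fence_virsh` already exist for this case.

```python
import subprocess


def parse_fence_options(lines):
    """Parse name=value lines; blank lines and '#' comments are skipped."""
    options = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            options[name.strip()] = value.strip()
    return options


def virsh_command(options):
    """Translate a fence action into a virsh call against the hypervisor."""
    domain = options["port"]                     # assumed: VM name to fence
    uri = options.get("uri", "qemu:///system")   # hypothetical option name
    action = options.get("action", "off")
    # 'virsh destroy' stops the guest immediately, like pulling the plug.
    verbs = {"off": "destroy", "on": "start", "status": "domstate"}
    if action not in verbs:
        raise ValueError(f"unsupported action: {action}")
    return ["virsh", "-c", uri, verbs[action], domain]


def run_fence(lines):
    """Run the requested action and return virsh's exit code."""
    return subprocess.run(virsh_command(parse_fence_options(lines))).returncode
```

An entry point would call `run_fence(sys.stdin)` and exit with the returned code. For a cloud provider the same skeleton would apply, with the `virsh` call replaced by the provider's own stop-instance API.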