We are designing the architecture of a distributed system based on Apache Ignite. The system has strict fault-tolerance requirements.
We have three data centers (DCs) for this: two main DCs (DC1 and DC2) and one reserve DC (DC3). DC1 and DC2 are connected over a fast 40 GbE channel. The reserve DC3 is connected to DC1 and DC2 over slower 1 GbE channels.
We plan to use ZooKeeper Discovery for the Ignite cluster and want to spread the ZooKeeper cluster nodes across the three DCs, one node per DC.
We plan to place Ignite cluster nodes only in the main DCs (DC1 and DC2). DC1 and DC2 will have an equal number of Ignite nodes.
What happens to the Ignite cluster when network segmentation occurs, that is, when the 40 GbE channel between the main DCs (DC1 and DC2) goes down?
For example, suppose the ZK3 node in DC3 is the leader and ZK1 and ZK2 are followers. In this situation the leader can still communicate with both followers, but the followers have lost their connection to each other. The ZooKeeper ensemble therefore keeps its quorum and stays operational.
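To make the ZooKeeper side of this scenario concrete, here is a minimal Python sketch of the majority check. It is only a toy model of the situation described above, not ZooKeeper code: the names `ensemble`, `links`, `reachable_from` and `has_quorum` are hypothetical, and it assumes the leader needs to reach a strict majority of the ensemble (counting itself).

```python
# Toy model of the ZooKeeper ensemble after the DC1-DC2 link fails.
# All names here are hypothetical and exist only for illustration.
ensemble = {"ZK1", "ZK2", "ZK3"}

# Surviving undirected links: only the 1 GbE channels to DC3 remain.
links = {frozenset({"ZK1", "ZK3"}), frozenset({"ZK2", "ZK3"})}


def reachable_from(node, ensemble, links):
    """Return the node itself plus every peer it still has a direct link to."""
    return {node} | {peer for peer in ensemble if frozenset({node, peer}) in links}


def has_quorum(node, ensemble, links):
    """Assumption: quorum means reaching a strict majority of the ensemble."""
    return len(reachable_from(node, ensemble, links)) > len(ensemble) // 2


# The leader in DC3 still sees both followers, so it keeps a majority.
leader_ok = has_quorum("ZK3", ensemble, links)
```

In this model ZK3 reaches all three ensemble members, so losing the ZK1-ZK2 link alone does not cost the leader its majority.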
Ignite cluster nodes in DC1 can communicate with the ZK1 and ZK3 nodes and with each other inside DC1. Ignite cluster nodes in DC2 can communicate with the ZK2 and ZK3 nodes and with each other inside DC2.
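The Ignite side of the scenario can be sketched the same way. The following Python snippet is a simplified model, not Ignite's actual resolver: the node names in `dc_of` are made up, two nodes per DC is an arbitrary choice, and `can_connect` simply assumes that only nodes in the same DC can reach each other once the 40 GbE link is down. It builds a connectivity graph from pairwise connection attempts and splits it into segments.

```python
from itertools import combinations

# Hypothetical Ignite nodes, two per main DC, purely for illustration.
dc_of = {"ign-1a": "DC1", "ign-1b": "DC1", "ign-2a": "DC2", "ign-2b": "DC2"}


def can_connect(a, b):
    """Assumption: with the DC1-DC2 link down, only same-DC nodes connect."""
    return dc_of[a] == dc_of[b]


def connectivity_graph(nodes):
    """Build adjacency sets from the results of pairwise connection attempts."""
    graph = {n: set() for n in nodes}
    for a, b in combinations(nodes, 2):
        if can_connect(a, b):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def components(graph):
    """Return the connected components of the graph (depth-first search)."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        result.append(comp)
    return result


segments = components(connectivity_graph(list(dc_of)))
sizes = sorted(len(s) for s in segments)
```

In this model the graph falls apart into two segments of equal size, one per DC, so a rule based only on segment size would have no obvious winner. That is the case we are asking about.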
How will the split-brain situation be resolved in this network segmentation case? Or will we end up with two independent Ignite clusters?
The documentation at https://apacheignite.readme.io/docs/zookeeper-discovery#section-failures-and-split-brain-handling says:
Whenever a node discovers that it cannot connect to some of the other nodes in the cluster, it initiates a communication failure resolve process by publishing special requests to the ZooKeeper cluster. When the process is started, all nodes try to connect to each other and send the results of the connection attempts to the node that coordinates the process (the coordinator node). Based on this information, the coordinator node creates a connectivity graph that represents the network situation in the cluster. Further actions depend on the type of network segmentation.
Can the coordinator choose one half of the Ignite cluster as the main one in this case?