My team and I developed a method for solving the split-brain problem in two-node systems and published a paper on it. We are not a well-known team, but we believe the method is new and practical, so we would like to discuss it here and hear whether others consider it a major innovation.
The problem we try to solve
Let me first describe the problem we want to solve. In a two-node distributed system, if the link between the nodes fails and there is no third node, the two nodes cannot run a leader election that guarantees both availability (liveness) and consistency (safety). This makes it impossible to design a two-node distributed storage or database system that offers both guarantees.
Engineers have come up with many ways to work around this problem. Some use more reliable hardware between the two nodes to avoid link failure, and some use a third node or a shared medium for arbitration. Both approaches place additional requirements on the hardware.
The solution we proposed
In this paper, we propose a new method that relies on neither an additional third node or shared medium nor a reliable link. We call it a "level-based leader election algorithm", although this name is not used in the paper.
Suppose we have a distributed storage or database system that is partially synchronous (also called eventually synchronous or semi-synchronous) and is composed of S server nodes, which are accessed by C client nodes.
- When S>=3, the S server nodes can use any uniform consensus algorithm, such as Paxos or Raft, to elect the leader. In this case, both availability (liveness: a leader will eventually be elected) and consistency (safety: there are never two different leaders at the same time) can be guaranteed.
- When S<=2 and C>=1, the client nodes also participate in the leader election process. With two server nodes and at least one client node, the total number of participating nodes is at least 3, so a uniform consensus algorithm can be used to elect the leader in a partially synchronous system. However, only server nodes have both the right to vote and the right to be elected; client nodes have only the right to vote.
- When S<=2 and C=0, the total number of nodes participating in the leader election process is less than 3, so no algorithm can guarantee both availability and consistency. But in this case there are no requests from client nodes at all! We can choose either of the two server nodes to be the leader, since neither availability nor consistency is needed.
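To make the three cases above concrete, here is a minimal Python sketch of how the set of voters and the winner could be determined. It is not the algorithm from the paper: it tallies a single round of votes with a strict-majority rule and leaves out everything a real consensus protocol such as Paxos or Raft handles (terms, retries, message loss). The names `electorate` and `elect_leader`, the `votes` mapping, and the choice of the smallest server ID when C=0 are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Optional, Set


def electorate(servers: Set[str], clients: Set[str]) -> Set[str]:
    """Return the nodes allowed to vote (hypothetical helper).

    With S >= 3 only servers vote; with S <= 2 the clients vote as well.
    """
    if len(servers) >= 3:
        return set(servers)
    return set(servers) | set(clients)


def elect_leader(servers: Set[str], clients: Set[str],
                 votes: Dict[str, str]) -> Optional[str]:
    """Tally one round of votes (voter -> candidate) and return the leader.

    Returns None when no candidate reaches a strict majority of the
    electorate, e.g. because a partition hides too many voters.
    """
    if len(servers) <= 2 and not clients:
        # S <= 2 and C = 0: nobody is issuing requests, so either server
        # will do. Picking the smallest ID is an arbitrary assumption.
        return min(servers) if servers else None

    voters = electorate(servers, clients)
    # Count only votes cast by eligible voters for server candidates:
    # clients may vote, but they may not be elected.
    tally = Counter(
        candidate
        for voter, candidate in votes.items()
        if voter in voters and candidate in servers
    )
    if not tally:
        return None
    winner, count = tally.most_common(1)[0]
    # Strict majority of the whole electorate, not just of the votes received.
    return winner if count > len(voters) // 2 else None
```

For example, with servers `{"s1", "s2"}` and one client `{"c1"}`, the votes `{"s1": "s1", "s2": "s2", "c1": "s2"}` elect `s2`, because the client breaks the tie. If `s1` is cut off and only its own vote `{"s1": "s1"}` is counted, the function returns `None`, so the isolated server cannot declare itself leader.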
Request
I hope you can help us judge whether this method can be regarded as a major innovation. In addition, we plan to add the level-based leader election primitive to existing storage and database request/response protocols. If anyone is interested, please let me know.