4

I'm looking to create a highly available, scalable networking solution built on a distributed data system. A node here describes a network that has control over one copy of the data. A node might contain more than one machine, but it holds exactly one copy of the data.

The nodes will contain data records which can be in a spent state or an unspent state. A client can request a transition for a record to go from an unspent state to a spent state (a request to spend). There is a security risk if they can successfully do this more than once.

A single node, if it has a connection to all other nodes, can tell them that a spend has been requested, and can confirm that no other node wants to access the record and that the spend has not already occurred. The node can then change the state of the record to spent; the other nodes will not do the same, since they know one node is already processing the spend. Finally, all nodes update their copy so that the record is in the spent state everywhere.
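To make the flow above concrete, here is a minimal in-memory sketch of it. Everything in it (`Node`, `grant`, `release`, `mark_spent`, `connect`) is a hypothetical name chosen for illustration, and it assumes full connectivity, no failures and no concurrent requests, which is exactly the case this paragraph describes.

```python
class Node:
    """Hypothetical node holding one copy of the data."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.state = {}  # record_id -> "unspent" or "spent"
        self.locks = {}  # record_id -> name of the node processing the spend

    def grant(self, record_id, requester):
        # Agree only if the record is unspent and no other node has claimed it.
        if self.state.get(record_id) != "unspent":
            return False
        holder = self.locks.setdefault(record_id, requester)
        return holder == requester

    def release(self, record_id, requester):
        # Drop a claim that was granted to a request which later failed.
        if self.locks.get(record_id) == requester:
            del self.locks[record_id]

    def mark_spent(self, record_id):
        self.state[record_id] = "spent"
        self.locks.pop(record_id, None)

    def spend(self, record_id):
        # Announce the spend to every node (including this one) and go ahead
        # only if all of them agree; then tell everyone the record is spent.
        everyone = [self] + self.peers
        granted = []
        for node in everyone:
            if not node.grant(record_id, self.name):
                for g in granted:
                    g.release(record_id, self.name)
                return False
            granted.append(node)
        for node in everyone:
            node.mark_spent(record_id)
        return True


def connect(nodes):
    # Fully connect the nodes to each other.
    for n in nodes:
        n.peers = [p for p in nodes if p is not n]


a, b, c = Node("A"), Node("B"), Node("C")
connect([a, b, c])
for n in (a, b, c):
    n.state["r1"] = "unspent"

first = a.spend("r1")   # accepted: every node agreed
second = b.spend("r1")  # rejected: every copy already says "spent"
```

The sketch deliberately ignores what happens when a node is unreachable, which is the hard part discussed next.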

If a node cannot reach another node, it can assume the other node is down and continue operating with the remaining nodes until that node comes back up, at which point it sends the recovered node all the updates it missed. If the failed node was in the middle of an incomplete spend operation, it can complete it then. This would cause minor downtime for some operations: when a node tells the other nodes it will spend and then fails before completing the spend, the other nodes are blocked from updating that record, so the failed node needs to come back online before the spend can be completed.

The problem is that the processing for the spend must happen only once. If the network were partitioned, an attacker who knew this could request the spend on one partition and also on the other. Each partition would assume the other to be down and so would operate independently. This could cause the spend to be processed more than once.

This would not be a problem if no requests were made to both sides of the network while they were partitioned. The network would become eventually consistent when the connections are re-established. If an attack was successful, the nodes would learn about it when they re-establish connections, because the two sides of the network would announce the same change.
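As a rough illustration of that detection step, the sketch below compares the spend logs of the two sides after they reconnect. The `Spend` record and its fields (`record_id`, `node_id`, `request_id`) are assumptions made up for this example, not part of any existing design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spend:
    record_id: str
    node_id: str     # node that processed the spend (hypothetical identifier)
    request_id: str  # client request that triggered it (hypothetical identifier)


def find_double_spends(log_a, log_b):
    """Return pairs of spends where both sides processed the same record
    for different requests while they were partitioned."""
    by_record = {s.record_id: s for s in log_a}
    conflicts = []
    for s in log_b:
        other = by_record.get(s.record_id)
        if other is not None and other.request_id != s.request_id:
            conflicts.append((other, s))
    return conflicts


side_a = [Spend("r1", "A", "req-1"), Spend("r2", "A", "req-2")]
side_b = [Spend("r2", "C", "req-9"), Spend("r3", "C", "req-3")]
conflicts = find_double_spends(side_a, side_b)  # only "r2" was spent on both sides
```

This only detects the double spend after the fact; it does nothing to prevent it.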

So it is a detectable attack, but is it practically possible?

An attacker would need to be deliberately trying to do this. The software is not designed to make several spend requests at once. There is a time cost to the attack: if the attacker fails, it takes time before they can recreate an unspent record, because creating unspent records requires money and it takes time to receive that money back to try again. More money must also be put into a single attack to get a higher benefit. They could afford many smaller attacks, but then both the benefit to them and the damage caused would be smaller.

Surely partitions are so rare naturally that an attacker attempting attacks at arbitrary times would have to be ridiculously lucky to succeed?

If a connection is naturally lost, a node can halt all operations and try to reconnect. Using a low timeout for the connection to the other node means this doesn't have to cause any downtime (perhaps only a rare increase in latency). If the reconnection fails, the node will keep retrying but will resume operations, assuming the other node is down. Would something like that protect against occasional connection errors?

So would an attacker be able to detect or cause a partition in the network? How likely is it that partitions will occur, and how long would they last? How can these issues be resolved, if at all?

Thank you.

Greg Askew
Matthew Mitchell

3 Answers

5

Having dealt with similar issues in clustering scenarios, I'm familiar with the situation you describe. Such systems frequently have the concept of a quorum, which is why they are usually built with an odd number of member nodes. The quorum is used to determine the majority and minority partitions.

The quorum is the number, greater than half of the members, that defines the minimum number of available nodes that must be present to provide services. If a network partition happens, at most one partition will have quorum; the others stop providing services until the partition goes away. If the network splits into several partitions at once, it can lead to no services being provided at all. However, it does guarantee that only one partition is serving, and that's how consistency is provided.
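The arithmetic behind that rule is small enough to sketch. The function names `quorum_size` and `has_quorum` are made up for illustration, and the sketch assumes every node knows the total membership count in advance.

```python
def quorum_size(total_nodes: int) -> int:
    """Smallest number of nodes that is strictly more than half of the cluster."""
    return total_nodes // 2 + 1


def has_quorum(reachable_nodes: int, total_nodes: int) -> bool:
    """reachable_nodes counts this node plus every peer it can currently reach."""
    return reachable_nodes >= quorum_size(total_nodes)


# Five nodes split 3/2: only the side with three nodes may keep serving.
majority_serves = has_quorum(3, 5)
minority_serves = has_quorum(2, 5)

# Five nodes split 2/2/1: no partition has quorum, so nothing is served.
three_way = [has_quorum(n, 5) for n in (2, 2, 1)]

# Four nodes split 2/2: an even cluster can deadlock, hence the odd-number advice.
even_split = [has_quorum(n, 4) for n in (2, 2)]
```

Because two disjoint partitions can never both hold more than half of the nodes, at most one of them passes the check.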

As for the likelihood of a partition, that depends on your infrastructure and how your nodes are communicating availability state to each other.

As for their ability to detect a partition event, that depends on your code. The main thing that would make such an attack possible is if both partitions are independently addressable during a partition, which may not be the case. In my experience, network partitions frequently exclude end-users from one partition as well as the other nodes. If the partitions are not addressable, then this attack is a lot less likely to succeed.

sysadmin1138
  • It is possible when one client does it from US and another from Japan. Two nodes can be geographically displaced and fully addressable in this case. When doing such money transaction, the easiest is to use central memory - SQL with transactions, and if it's unreachable, then it's failed. – Andrew Smith Aug 18 '12 at 16:26
  • Thank you for the answer. The quorum solution is very appealing although quite complex to ensure correct communication between nodes. From how I understand it, to process a spend request a node would need approval by the majority of other nodes. Considerations would need to be in place where no nodes can achieve a quorum and thus needs to reset. And it must be ensured that nodes do not vote for more than one node to do work. However I feel it is indeed achievable. – Matthew Mitchell Aug 19 '12 at 14:19
  • The solution does provide availability of the system in the cases where a minority of nodes are down or are not reachable between each other but not if a majority cannot be reached between each other. Still, what are the chances that the majority of nodes go down at the same time (unless a bug in the software)? This solution would also require more communication between nodes than longneck's solution but creating an extremely lightweight communication protocol should not be difficult and multiple requests/responses can be sent together when nodes are busy. Definitely worth experimenting with. – Matthew Mitchell Aug 19 '12 at 14:25
1

Distributed storage works best with a single origin of data that is replicated every n seconds, using e.g. an SQL index and replication rules that determine where to push it. A central "SQL" memory also controls the states.

So simply, when you change an object's state, the change is communicated to the origin node, and the transaction is performed in SQL with a lock on the record.

If a node cannot reach the origin at that time, the operation must fail, as the state lives only on the origin server.
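A minimal sketch of such a transaction on the origin, using Python's built-in `sqlite3` as a stand-in for whatever SQL server the origin actually runs. The `records` table and the `spend` helper are assumptions for illustration. Rather than taking an explicit lock, it relies on a single conditional `UPDATE`, which the database applies atomically.

```python
import sqlite3


def spend(conn, record_id):
    """Move a record from 'unspent' to 'spent'; return True only if this call did it."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE records SET state = 'spent' WHERE id = ? AND state = 'unspent'",
            (record_id,),
        )
        return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id TEXT PRIMARY KEY, state TEXT NOT NULL)")
conn.execute("INSERT INTO records VALUES ('r1', 'unspent')")
conn.commit()

first = spend(conn, "r1")   # the first request changes the row
second = spend(conn, "r1")  # the second request matches no unspent row
```

An edge node that cannot open the connection to the origin simply gets an error and has to fail the request, as described above.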

This is like an origin-edge workflow, where the origin holds the "memory" (states) and the edge holds the "content" (objects).

It is theoretically impossible to bypass the above model of edge and central memory while preserving security and keeping things simple. The above model is the most efficient and most correct, and fuzzing it just makes life difficult.

Andrew Smith
  • Thank you for the answer but your idea creates a single point of failure which is not what I want. – Matthew Mitchell Aug 19 '12 at 14:10
  • For the availability layer above, one can use a double internet connection and redundant servers, and this does the job most often when the data is flowing upstream. But distributed, I really think this is just a headache. I prefer to make it a tree, however this is not easy – Andrew Smith Aug 20 '12 at 18:28
  • I would not want to put trust in a single data-centre. Any downtime would be very bad for the service I'm providing. – Matthew Mitchell Aug 21 '12 at 17:39
0

If you're looking for a practical solution to allow transactions to continue while there is a partition, I have an idea.

For each new data record that is created as unspent, assign it to a single node. While the network is partitioned, the data records assigned to the reachable nodes are the only ones that clients are allowed to spend. When the partition is resolved, all of the nodes resynchronize which data records are spent. Since each node reachable by clients only spent records assigned to itself, there should be no over-spent records.

Thought will have to be put into how records are assigned to nodes, and what to do when a node runs out of its own records during both unified and partitioned operation.
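Here is a rough sketch of that ownership rule, with invented names (`OwnerNode`, `create_record`, `merge_spent`). It assumes each record is assigned to exactly one node at creation and leaves out the open questions above, such as how records are distributed.

```python
class OwnerNode:
    """Hypothetical node that may only spend records assigned to it."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.owned_unspent = set()  # records this node is allowed to spend
        self.spent = set()          # every spent record this node knows about

    def create_record(self, record_id):
        # A new unspent record is assigned to exactly one node.
        self.owned_unspent.add(record_id)

    def spend(self, record_id):
        # Refuse anything this node does not own, partition or not.
        if record_id not in self.owned_unspent:
            return False
        self.owned_unspent.remove(record_id)
        self.spent.add(record_id)
        return True

    def merge_spent(self, other):
        # After the partition heals, learn which records the other node spent.
        self.spent |= other.spent


a, b = OwnerNode("A"), OwnerNode("B")
a.create_record("r1")
b.create_record("r2")

# During a partition the attacker tries "r1" on both sides.
spent_on_a = a.spend("r1")  # A owns r1, so this succeeds
spent_on_b = b.spend("r1")  # B does not own r1, so this is refused

a.merge_spent(b)
b.merge_spent(a)
```

Because a record has one owner, the two sides of a partition can never both accept a spend for it, so the merge step has no conflicts to resolve.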

longneck
  • Like MySQL replication would work like this, but when you query MySQL in US and Japan, you need to lock the record in every location, and re-synchronise after shutdown. – Andrew Smith Aug 18 '12 at 18:41
  • Um, no. The point of my suggestion is to assign a record to a node/location when it's created. Records are only used if all nodes agree, or if the record belongs to the same node as the client. As long as you follow that rule, no double assignment is possible. – longneck Aug 18 '12 at 23:09
  • I had the idea of dividing work between nodes. This is a simple solution and has some advantages. The advantages being that if a node is taken over by an attacker, the attacker wouldn't have complete access to all records; there would be some functionality for failures down to a single operating node and it is a reasonably simple solution without needing much communication between nodes since each node independently operates on their own records. However, the solution means if a node goes down, some of the records are not spendable. – Matthew Mitchell Aug 19 '12 at 14:14
  • Well, the alternative is that whatever nodes do not have quorum will not be able to spend anything at all while the network is partitioned. So you have to decide between: during a partition, some nodes can either a) not spend at all because they do not have quorum, or b) spend what is available to them and redistribute once the partition is resolved. Only YOU can decide what the best compromise is for your business model. And there will be a compromise somewhere if you want resiliency. Anything else is full functionality. – longneck Aug 19 '12 at 16:14
  • I can program the client to try other nodes if one node cannot achieve a quorum. So if the client makes a request to a node in a partition without the quorum, it will eventually choose a node with a quorum. So the client will be able to spend. – Matthew Mitchell Aug 19 '12 at 20:00
  • As long as the client can reach a node that can achieve a quorum that is. – Matthew Mitchell Aug 19 '12 at 20:01
  • If in a partitioned network a client can contact multiple nodes until it finds a node with quorum, then why can't a node do the same? I don't think you have thought this all the way through or defined partition correctly. – longneck Aug 19 '12 at 23:21
  • What if you have nodes A, B and C, where A and B are connected and C is left on its own? A client may be able to access A, B and C and use the AB partition, or would that never be the case? – Matthew Mitchell Aug 19 '12 at 23:33
  • let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/4567/discussion-between-matthew-mitchell-and-longneck) – Matthew Mitchell Aug 19 '12 at 23:33