Algorithm for distributed messaging?

Question

I have a distributed application across which I'd like to replicate a single, eventually consistent state. The data is suitable for a CRDT (http://pagesperso-systeme.lip6.fr/Marc.Shapiro/papers/RR-6956.pdf) which has the excellent property that each node, given the same set of messages, will deterministically converge to the same value without complicated consensus protocols.
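To illustrate the convergence property, here is a minimal sketch using a grow-only counter (G-Counter), one of the simplest state-based CRDTs. The actual data type in my application is different; the names `merge`, `value`, and `msgs` are invented for this example. It shows that merging the same set of messages in any order, even with a duplicate delivery, gives the same state.

```python
from itertools import permutations

def merge(a, b):
    """Join two G-Counter states by taking the per-node maximum."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def value(state):
    """The counter's value is the sum of all per-node counts."""
    return sum(state.values())

# Three hypothetical messages, each carrying a (partial) counter state.
msgs = [{"a": 2}, {"b": 1}, {"a": 1, "c": 4}]

results = set()
for order in permutations(msgs):
    state = {}
    for m in order:
        state = merge(state, m)
    # Deliver the first message a second time: merge is idempotent,
    # so a duplicate does no harm.
    state = merge(state, order[0])
    results.add(tuple(sorted(state.items())))

final = dict(next(iter(results)))
```

Every delivery order ends in the same state, which is why the messaging layer only has to guarantee that each message eventually arrives, not that messages arrive in order or exactly once.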

However, I need another messaging/log layer that will ensure that each node actually sees every message, even in the face of adverse network conditions.

Specifically, I'm looking for an algorithm that has the following properties:

Works on an asynchronous network.
Nodes are only necessarily aware of their neighbors, not the whole network.
Nodes may be added or dropped at any time (that is, the network is not of a fixed size or topology).
The network can be acyclic (this can be a requirement, if necessary).
Is capable of bringing up to date a node that has fallen behind due to a temporary network outage or dropped messages.
Is capable of bringing a new, empty node joining the cluster up to date.
There is no hard limit on the time taken for the network to converge on a value (that is, for every node to receive every message), but given no partitions it should be fairly quick (in fuzzy terms, a matter of seconds, not minutes).
Is bounded in size. Algorithms that keep the entire message history (which will grow boundlessly) are unsuitable.

Is anyone aware of an algorithm with these properties?

How would it be able to bring a new node fully up to date without storing the entire message history? Is the latest calculated value of a node + all subsequent messages enough to "sync" up a new node to the current state of the system? — lukevp, Nov 11 '14 at 17:41
Depends on the algorithm, but it should be possible, yes. A node should be able to replicate its state without storing the entire history. — levand, Nov 11 '14 at 18:39
Can a node replicate another node's state without storing the entire message history? Is the effect of receiving a message idempotent, so that if I delete any memory of what I have received from some node and then receive the message again later, no damage is done? Or, failing that, is the number of nodes bounded at a reasonable size, so that a node can keep one data structure per node that it ever knew about? If so, then a fast algorithm is possible. A naive implementation is: "periodically: for i in neighbours: git pull repos on neighbour" and set a post-commit hook to push to neighbours. — Max Murphy, Nov 14 '14 at 18:37
Perhaps this might be of interest to you? http://www.netcod.org/papers/16HoLKM-final.pdf — user3614014, Nov 18 '14 at 16:03
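
The "periodically pull from each neighbour" idea from the comments can be sketched roughly as below. This is only an illustration under some assumptions: the replicated state is a G-Counter standing in for the real CRDT, the topology is a simple acyclic line, and `Node`, `link`, and `pull` are hypothetical names rather than part of any library. Each node keeps only its current merged state (one entry per node it has heard of) and no message history.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.state = {}        # G-Counter state: node name -> count
        self.neighbours = []   # only directly connected nodes are known

    def increment(self, n=1):
        """Local update: bump this node's own entry."""
        self.state[self.name] = self.state.get(self.name, 0) + n

    def pull(self):
        """One anti-entropy round: merge each neighbour's full state."""
        for nb in self.neighbours:
            for k, v in nb.state.items():
                self.state[k] = max(self.state.get(k, 0), v)

def link(x, y):
    """Connect two nodes as neighbours (undirected edge)."""
    x.neighbours.append(y)
    y.neighbours.append(x)

# Acyclic line topology: a - b - c
a, b, c = Node("a"), Node("b"), Node("c")
link(a, b)
link(b, c)
a.increment(3)
c.increment(2)

# A new, empty node joins next to c and knows nothing yet.
d = Node("d")
link(c, d)

nodes = [a, b, c, d]
rounds = 0
# Repeat pull rounds until every node holds the same state.
while len({tuple(sorted(n.state.items())) for n in nodes}) > 1:
    for n in nodes:
        n.pull()
    rounds += 1
```

Because the merge is idempotent, pulling the same state repeatedly is harmless, and the new node `d` catches up from its neighbour's current state alone. The trade-off is that every pull ships the whole state, and the state grows with the number of nodes that ever wrote rather than with the number of messages.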
