Let's say I have an internal system of 20+ nodes that pass data back and forth to each other over sockets, and low latency is a very high priority. How do I design it so that if one or more servers go down at random, I can recover and resend the data that was already sent to the downed server but not yet processed by it?
For example, suppose A is streaming data to B, and at some point B goes down without having processed some of that data. If we assume A can detect that B went down and reroute the data to C, how would I design it so that I know which data was sent to B but not processed, and should now be rerouted to C?
I'm assuming I'll have to rely on one of the various message queue systems out there, but I'm wondering whether there is also a simpler way to do this.
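To make the scenario concrete, here is a rough sketch of the kind of thing I have in mind without a message queue: A tags every message with a sequence number, keeps it in memory until B acknowledges that it has been processed, and replays whatever is still unacknowledged to C when B dies. Everything here is an assumption for illustration: the names `ReplayBuffer`, `send`, `ack`, and `replay` are made up, `transport` stands in for the real socket write, and the acknowledgement is assumed to be cumulative ("everything up to this sequence number is processed").

```python
from collections import OrderedDict


class ReplayBuffer:
    """Sender-side record of messages sent but not yet acknowledged as processed."""

    def __init__(self):
        self._next_seq = 0
        self._unacked = OrderedDict()  # seq -> payload, kept in send order

    def send(self, payload, transport):
        """Assign a sequence number, remember the message, then hand it to the transport."""
        seq = self._next_seq
        self._next_seq += 1
        self._unacked[seq] = payload
        transport(seq, payload)  # hypothetical callback, e.g. a socket write
        return seq

    def ack(self, seq):
        """Cumulative ack: drop every message with a sequence number <= seq."""
        while self._unacked and next(iter(self._unacked)) <= seq:
            self._unacked.popitem(last=False)

    def replay(self, transport):
        """Resend everything still unacknowledged, in order, to a new destination."""
        for seq, payload in self._unacked.items():
            transport(seq, payload)
        return list(self._unacked)


# A streams four messages to B; B acknowledges up to seq 1, then goes down.
to_b, to_c = [], []
buf = ReplayBuffer()
for msg in ("m0", "m1", "m2", "m3"):
    buf.send(msg, lambda s, p: to_b.append((s, p)))
buf.ack(1)

# A detects the failure and reroutes the unprocessed tail to C.
buf.replay(lambda s, p: to_c.append((s, p)))
print(to_c)  # [(2, 'm2'), (3, 'm3')]
```

As far as I can tell this only gives at-least-once delivery: B might have processed a message and crashed before its ack reached A, so C (or whatever sits downstream) would have to tolerate duplicates, for example by ignoring sequence numbers it has already seen. It also does nothing if A itself goes down, since the buffer lives in A's memory.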