5

I have a Node.js script - let's call it "process1" - on server1, and the same script running on server2 as "process2" (just with flag=false).

Process1 will be performing actions and will be in a "running" state at the beginning. Process2 will be running, but in a "blocked" state controlled by a flag programmed within it.

What I want to accomplish is to implement failover/fallback for this process. If process1 goes down, the flag on process2 will change and process2 will take over all tasks from process1 (and vice versa when process1 comes back - fallback).

What is the best approach to do this? A TCP connection between the two?

[Diagram: the Data Provider connects to both servers; only one server posts to the Data Receiver at any given moment]


NOTE: Even if it's not very relevant, I want to mention that these processes are going to work internally, establishing a TCP connection with a third server and parsing the data we get from that server. The script will be running on both servers, but only ONE process at a time can be providing services - running with flag=true (never both of them).


Update: As per the discussions below, and internal research/testing and monitoring of the solution, using a reverse proxy will save you a lot of time. Programming failover based on 2 servers only will cover ~70% of the cases - those related to the internal process used on both machines - but you will not be able to detect the other ~30% of issues, caused by problems with the network (especially if you have a lot of traffic towards the DATA RECEIVER).
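For reference, the peer-heartbeat idea discussed in the comments can be reduced to a small state machine. This is only an illustrative sketch - the class name, method names, and threshold are all invented here: each process counts missed heartbeats from its peer (received, say, over the TCP link between the two servers) and flips its flag accordingly.

```javascript
// Hypothetical failover state machine. Each process runs one instance,
// feeds it heartbeat events from its peer, and checks `active` before
// posting to the Data Receiver.
class FailoverFlag {
  constructor(isPrimary, maxMissed = 3) {
    this.isPrimary = isPrimary; // process1 starts as the primary
    this.active = isPrimary;    // the "flag": am I allowed to post?
    this.maxMissed = maxMissed; // missed heartbeats tolerated before takeover
    this.missed = 0;
  }

  // Call whenever a heartbeat arrives from the peer.
  peerHeartbeat() {
    this.missed = 0;
    // Fallback: when the primary is reachable again, the standby steps down.
    if (!this.isPrimary) this.active = false;
  }

  // Call on every heartbeat interval that elapses without a message.
  heartbeatTimeout() {
    this.missed += 1;
    if (this.missed >= this.maxMissed) this.active = true; // take over
  }
}
```

Note the caveat from the update above: if only the network link between the two servers fails while both processes stay healthy, both sides see timeouts and both set `active = true` (split-brain) - exactly the class of failure a reverse proxy in front of them avoids.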

cool
  • Reverse proxy. Either with Node or nginx (or others...) – Amit Aug 12 '15 at 12:12
  • @Amit Can you please elaborate on that a little bit? You are suggesting adding an additional server in between which is going to handle this? Not sure I am following. These scripts will not be publicly available (serving HTTP requests), if you are referring to placing a reverse proxy before the servers for that case. – cool Aug 12 '15 at 12:21
  • then I guess my comment is irrelevant, but your question isn't telling enough of the story (in my opinion anyway). What is the process doing? is it a single long task, or serving ongoing requests? what does it mean "goes down" and how can this be identified? – Amit Aug 12 '15 at 12:36
  • @Amit I agree with you; I've updated the question already. Regarding your question "what does it mean goes down?": if the script on server1 is not working/the process went down/was killed, that's one case. The other is when a network failure happens on one of the servers (and the server is not reachable). – cool Aug 12 '15 at 12:47
  • So is that 3rd server able to recognize a server being "down"? Can it do something about it (restart something, switch host internally...)? If the answer is that it can "handle the situation", it probably should. Otherwise you could use a proxy or let your "pending" server be a tunnel - if srv1 is up, forward, otherwise - process. – Amit Aug 12 '15 at 12:56
  • @Amit Unfortunately the third server is not controlled by me (consider it a server which has an API exposed via TCP that I'm just using), and I am not able to control or manipulate anything from there at all. – cool Aug 12 '15 at 13:01
  • It's not clear how things are related and what is (or should be) responsible for the failover. Please edit the question again, describe each component and the EXACT relation, and particularly describe how a down server is recognized and how you're thinking it might be solvable. It's very helpful if you provide some solution (even if you think it's poorly designed) just so that the exact requirements are clear. – Amit Aug 12 '15 at 13:09
  • @Amit Updated. That is the situation I was trying to explain. The logical solution (actually the only solution I was able to think of) is to establish a TCP connection between those two. But I am not sure whether, on either side, I would be able to conclude (when the connection goes down) where the failure came from (s1 or s2). – cool Aug 12 '15 at 13:26
  • Just 2 last pieces of the puzzle: 1) Is "Data Provider" connecting to both servers? is that configurable? can it be any other server, or can it be larger than 2? 2) In your diagram only srv1 connects to the receiver. is that just to show that only 1 is connected (or posting..) at any given moment, or that it HAS to be srv1? (P.S. it doesn't look like this has anything to do with node specifically, more of a distributed system architecture) – Amit Aug 12 '15 at 13:33
  • In my case there are only 2 servers connected to the "Data Provider", and there will not be more than that. Regarding the connection to the receiver: yes, I left the connection from s2 in to indicate that only 1 server is able to post at a time (it does not need to be s1). Both servers are able to post to the Data Receiver; the condition is just that when one is posting, the second one is not able to do so. – cool Aug 12 '15 at 13:47

2 Answers

2

This is more of an infrastructure problem than it is a Node one, and the same situation can be applied to almost any server.

What you basically need is some service that monitors Server 1 and determines whether it's "healthy" or "alive"; if so, it continues to direct traffic to it. If the service determines that the server is no longer in a stable condition (e.g. it takes too long to respond, or returns an error), it will redirect any incoming traffic to Server 2. When it's satisfied Server 1 has returned to normal operating conditions, it will redirect the traffic back to it.

In most cases, the "service" in this scenario is a reverse proxy like Nginx or CloudFlare. In your situation, this server would act as a buffer between the Data Receiver and your network (Server 1 / Server 2) and route the incoming traffic to the relevant server.
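As a concrete illustration, a minimal nginx configuration for this pattern might look like the following. This is a sketch, not a drop-in config: the addresses, ports, and thresholds are placeholders, and since the traffic described here is raw TCP rather than HTTP, it uses nginx's `stream` module with a passive health check (`max_fails`/`fail_timeout`) and Server 2 marked as a pure `backup`.

```nginx
# Hypothetical nginx stream config: Server 2 only receives traffic
# once Server 1 has failed its passive health check, and traffic
# returns to Server 1 when it becomes reachable again.
stream {
    upstream node_processes {
        server 192.0.2.10:9000 max_fails=3 fail_timeout=10s;  # Server 1
        server 192.0.2.20:9000 backup;                        # Server 2
    }

    server {
        listen 9000;
        proxy_pass node_processes;
    }
}
```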

James
  • So basically, it's impossible to implement a stable solution without adding an additional proxy server between the Data Receiver and those 2 servers? – cool Aug 12 '15 at 13:44
  • It's certainly not *impossible* but it would probably be more efficient & reliable to use a reverse proxy in this situation (it's exactly what they were designed for). – James Aug 12 '15 at 13:46
  • Regarding the "impossible" solution, I was referring to a stable one. I will base this implementation on a TCP connection (both ways) between the processes. It will be quite stable, but not nearly as stable as the reverse proxy solution. – cool Aug 12 '15 at 14:26
1

That looks like a classic use case for a reverse proxy. Using a well-tested server such as nginx should provide plenty of reliability: the proxy won't fail (other than through hardware failure), and you could put it in front of whatever cluster size you want. You'd even get the benefit of load balancing, if that is applicable and configured properly.

Alternatively, and also leaning towards a load-balancing solution, you could have a front server push requests into a queue (ZMQ for example) and either push from the queue to the app server(s) or have your app server(s) pull tasks from the queue independently.

In both solutions, if it's a requirement not to "push" 2 simultaneous results to your data receiver, you could use an outbound queue that all app-servers push into.
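To make the queue idea concrete, here is a minimal in-memory sketch of the competing-consumers pattern (a broker such as ZMQ would replace the `TaskQueue` in practice; all names here are invented for illustration). The point is that failover falls out for free: a dead server simply stops pulling, and the surviving one absorbs the work without any flag-flipping protocol.

```javascript
// In-memory stand-in for a PUSH/PULL task queue.
class TaskQueue {
  constructor() { this.tasks = []; }
  push(task) { this.tasks.push(task); }
  pull() { return this.tasks.shift(); } // undefined when empty
  get size() { return this.tasks.length; }
}

// Round-robin over the app servers; only live servers actually pull work.
function drain(queue, servers) {
  const handled = [];
  let i = 0;
  while (queue.size > 0) {
    if (!servers.some(s => s.alive)) break; // nobody left to pull
    const server = servers[i % servers.length];
    i += 1;
    if (!server.alive) continue; // a dead server never pulls a task
    handled.push({ server: server.name, task: queue.pull() });
  }
  return handled;
}
```

The same shape works for the outbound side: both app servers push results into a single outbound queue, and one consumer drains it towards the data receiver, which enforces the "never 2 simultaneous posts" requirement.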

Amit