Single fault-tolerant machine with Amazon AWS

Question

For a particular service, I need to run a single EC2 instance in a fault tolerant way.

Only in case of errors do I want the "primary" machine to be terminated and the traffic redirected to the "secondary" machine, automatically and within a few seconds. This is the classic primary/secondary setup, with the constraint that the secondary server must not serve traffic unless the primary has crashed.

I'm quite new to this world, but as far as I understand, with an Elastic IP I need to manually change the binding if the primary machine hangs. Instead, with Auto Scaling, ELB and CloudWatch I can:

Set up an Auto Scaling group with 2 machines, but then the traffic will be load balanced (sticky sessions are not what I want, because I need all the traffic on the primary machine as long as it works).
Set up an Auto Scaling group with just 1 machine, so that if the primary machine hangs a new one comes online automatically. However, as far as I know, the boot process takes several minutes.

Any advice on how I can combine AWS services to achieve this goal?

What kind of application? And will the backup machine already be running, i.e. a 'hot spare'? — E.J. Brennan, Mar 31 '14 at 14:24
It is a socket-based service with state held in RAM. The "backup machine" should probably be a hot spare, but I don't actually know: the right architecture is exactly my question. I just need my service to switch over within a few seconds. — allergique, Apr 03 '14 at 09:40

score 0 · Accepted Answer · answered Mar 31 '14 at 14:37

There are automation options you could develop with the EC2 API, but you would need an always-online machine to run them.

The preferred scenario within EC2 is to have a load balancer distribute traffic between two machines using a shared-nothing architecture (meaning persistent data lives on S3 or in a database that is not on the instance).

If your application does not allow for this, you can set up a backup instance that health-checks your primary instance. Using a custom script, if the health check fails, you would remap an Elastic IP address to the backup instance, then terminate and relaunch your primary instance. Once the health check passes again, you could automatically return the IP to the primary instance. This would probably be easier to set up in a VPC than in EC2-Classic, as you would have control of private IP addresses.
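
The custom script described above could look roughly like the following. This is only a minimal sketch, not a tested production setup. It assumes the service listens on a TCP port, that the Elastic IP belongs to a VPC (so it has an allocation ID), and that boto3 is installed with credentials available on the backup instance. The function names, the failure threshold and the polling interval are illustrative choices, not anything prescribed by AWS.

```python
import socket
import time
from typing import Callable, Optional


def tcp_health_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def remap_elastic_ip(allocation_id: str, backup_instance_id: str) -> None:
    """Move the Elastic IP to the backup instance.

    Assumption: a VPC Elastic IP identified by its allocation ID.
    """
    import boto3  # third-party; imported lazily so the rest runs without it

    ec2 = boto3.client("ec2")
    ec2.associate_address(
        AllocationId=allocation_id,
        InstanceId=backup_instance_id,
        AllowReassociation=True,  # take the address over from the primary
    )


def watch_primary(
    check: Callable[[], bool],
    on_failure: Callable[[], None],
    max_failures: int = 3,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    max_checks: Optional[int] = None,
) -> bool:
    """Poll check() and call on_failure() once after max_failures
    consecutive failed checks. Returns True if failover was triggered,
    False if max_checks was reached without triggering it."""
    failures = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if check():
            failures = 0  # any success resets the counter
        else:
            failures += 1
            if failures >= max_failures:
                on_failure()
                return True
        sleep(interval)
    return False
```

On the backup instance you would call `watch_primary` with a check such as `lambda: tcp_health_check(primary_ip, service_port)` and a failure handler that calls `remap_elastic_ip` with your own allocation and instance IDs. Requiring several consecutive failures avoids failing over on a single dropped connection, at the cost of a few extra seconds before the switch. Terminating and relaunching the primary, and handing the IP back afterwards, are left out of this sketch.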
