As part of a recent DR exercise, we simulated an availability zone becoming unavailable. During the exercise, ECS kept trying to start tasks in the "failed/unavailable" AZ.

Is it possible to prevent this situation from happening?

An idea was proposed to use a parallel process to update the ECS tasks with a placementConstraint directive that excluded the unavailable AZ. However, relying on an active process during a disaster seems like a recipe for, well, disaster.
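For reference, here is a minimal sketch of what that proposed parallel process might build. It assumes the tasks run on EC2 container instances (where `memberOf` placement constraints apply) and uses the built-in `ecs.availability-zone` attribute from the ECS cluster query language. The helper name `exclude_azs_constraint` and the AZ name `us-east-1a` are made up for illustration.

```python
def exclude_azs_constraint(unavailable_azs):
    """Build a memberOf placement constraint that skips the given AZs.

    Hypothetical helper: it only builds the data structure and does not
    call AWS. The AZ names are supplied by the caller.
    """
    if not unavailable_azs:
        raise ValueError("at least one AZ name is required")
    # One "!=" clause per AZ, joined with "and" (cluster query language).
    clauses = [
        f"attribute:ecs.availability-zone != {az}"
        for az in sorted(set(unavailable_azs))
    ]
    return [{"type": "memberOf", "expression": " and ".join(clauses)}]


# Example with an assumed AZ name.
constraints = exclude_azs_constraint(["us-east-1a"])
print(constraints)
```

The resulting list has the shape of the `placementConstraints` parameter accepted by ECS API calls such as RunTask. Something still has to apply it while the AZ is down, which is exactly the dependency we would like to avoid.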

Is it possible to use a static placement constraint that is in place before the disaster event? In other words, is it possible to say "if an AZ is unavailable, then don't try to start tasks in that AZ"?

Thank you

acjca2

1 Answer

Thanks for sharing this scenario that you and your team were looking into! This is a new feature that we are exploring for our customers and we will share more details as we can.

It would be great to know a little more about how you performed your disaster recovery simulation. In particular, I'd be curious to hear some high-level details on how you simulated an AZ becoming unavailable.

Thanks,

The AWS ECS team

  • If you can provide an AWS email address for us to send some details to, that would be great. I can't provide any further details in public, but I'd like to find out more about what the ECS team is doing with regards to DR. – acjca2 Apr 05 '23 at 13:43