
We have a mobile app backend server using Elastic Beanstalk autoscaling with 4 t2.small instances.

When we send out push notifications, they cause a large, short-lived spike in traffic to the servers. Since autoscaling takes ~3 minutes to kick in, it's fairly useless for these spikes.


How can we reduce the latency during these spikes without burning excessive CPU/$ during the lower traffic times?

jaynp
  • What about... sending out your push notifications over a longer period of time? – Michael - sqlbot Sep 07 '17 at 23:27
  • @Michael-sqlbot Real-time is important in this case. – jaynp Sep 08 '17 at 18:46
  • Without knowing what your application is doing and a lot of detail about it, it's impossible to say what you might need to do. @Tim's answer seems promising, but also... API Gateway and Lambda are promising, and CloudFront caching and deduplication of identical requests during traffic spikes is promising... but it's hard to say which of those might be applicable. – Michael - sqlbot Sep 08 '17 at 20:02

1 Answer


I don't think you can rely on auto scaling. AWS has a page on manual scaling which you should read.

You could make use of scheduled scaling, set up to scale out before your notifications go out.
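As a rough sketch of that idea, a one-off scheduled action could be registered a few minutes before the send time. The helper names, the group name, the 5-minute lead, and the sizes 8 and 12 are all assumptions for illustration; `put_scheduled_update_group_action` is the boto3 Auto Scaling call used. On Elastic Beanstalk the Auto Scaling group is managed by the environment, so the same thing may be better expressed through the environment's own scheduled-action settings.

```python
from datetime import timedelta


def scale_up_request(group_name, send_time, lead_minutes=5, min_size=8, max_size=12):
    """Build arguments for a one-off scheduled action that raises capacity
    shortly before a push notification goes out (all sizes are hypothetical)."""
    start = send_time - timedelta(minutes=lead_minutes)
    return {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": "pre-push-" + start.strftime("%Y%m%dT%H%M"),
        "StartTime": start,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": min_size,
    }


def schedule_scale_up(client, group_name, send_time, **kwargs):
    """Register the action; `client` is expected to be boto3.client("autoscaling")."""
    request = scale_up_request(group_name, send_time, **kwargs)
    client.put_scheduled_update_group_action(**request)
    return request


# Usage (requires boto3 and AWS credentials; "my-asg-name" is a placeholder):
#   import boto3
#   from datetime import datetime, timezone
#   send_at = datetime.now(timezone.utc) + timedelta(minutes=30)
#   schedule_scale_up(boto3.client("autoscaling"), "my-asg-name", send_at)
```

The lead time should be longer than the ~3 minutes the instances need to come up, and a second scheduled action (or a later manual change) is needed to scale back in afterwards.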

You could simply start more servers manually, add them to the load balancer, and stop them manually when they're no longer required. This can be done with the console or using a script that calls the API.
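A script for the manual approach might look roughly like the following. It assumes the spare instances already exist in a stopped state with the app deployed, and that the environment uses a Classic Load Balancer (the boto3 `elb` client); an Application Load Balancer would need the `elbv2` target-group calls instead. The function names and arguments are hypothetical.

```python
def add_spare_servers(ec2, elb, load_balancer_name, instance_ids):
    """Start stopped instances and put them behind a Classic Load Balancer.

    `ec2` and `elb` are expected to be boto3.client("ec2") and boto3.client("elb").
    """
    ec2.start_instances(InstanceIds=instance_ids)
    # Block until EC2 reports the instances as running.
    ec2.get_waiter("instance_running").wait(InstanceIds=instance_ids)
    elb.register_instances_with_load_balancer(
        LoadBalancerName=load_balancer_name,
        Instances=[{"InstanceId": i} for i in instance_ids],
    )


def remove_spare_servers(ec2, elb, load_balancer_name, instance_ids):
    """Take the spare instances out of rotation, then stop them."""
    elb.deregister_instances_from_load_balancer(
        LoadBalancerName=load_balancer_name,
        Instances=[{"InstanceId": i} for i in instance_ids],
    )
    ec2.stop_instances(InstanceIds=instance_ids)
```

Stopped t2.small instances only incur storage charges, which is what keeps this cheap during low-traffic periods. The load balancer still needs the instances to pass health checks before it sends them traffic, so `add_spare_servers` should run a few minutes ahead of the notification.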

You could change the minimum group size using the console or API before you send a notification.
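Through the API, one way to do this on Elastic Beanstalk is to update the environment's `MinSize` option in the `aws:autoscaling:asg` namespace. The sketch below assumes a boto3 `elasticbeanstalk` client; the helper names, environment name, and instance counts are placeholders.

```python
def min_size_settings(min_size):
    """Option settings that set the Auto Scaling group's minimum instance count."""
    return [{
        "Namespace": "aws:autoscaling:asg",
        "OptionName": "MinSize",
        "Value": str(min_size),  # option values are passed as strings
    }]


def set_min_instances(eb, environment_name, min_size):
    """Apply the new minimum; `eb` is expected to be boto3.client("elasticbeanstalk")."""
    eb.update_environment(
        EnvironmentName=environment_name,
        OptionSettings=min_size_settings(min_size),
    )


# Usage (requires boto3 and AWS credentials; names and sizes are hypothetical):
#   import boto3
#   eb = boto3.client("elasticbeanstalk")
#   set_min_instances(eb, "my-env", 8)   # a few minutes before sending
#   ...send the push notification, wait for the spike to pass...
#   set_min_instances(eb, "my-env", 4)   # back to the normal minimum
```

The new minimum must not exceed the group's configured maximum, and the environment update itself takes some time to apply, so it needs to be triggered with more lead time than the ~3 minutes an instance takes to start.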

Tim