Auto-scaling processors based on SQS, with different backlog SLOs

Question

I have a problem where we have different SLOs (service level objectives) depending on the request. Some requests must be processed within 5 minutes, while others can take longer (for example, 2 hours).

I was going to use Amazon SQS to queue the messages that need processing, and then use Auto Scaling to add resources so that messages are processed within the allotted SLO. For example, if 1 machine can process 1 request every 10 seconds, then within 5 minutes it can process 30 messages. If I detect that the number of messages in the queue is > 30, I should spawn another machine to meet the 5-minute SLO.

Similarly, if I have a 2-hour SLO, I can have a backlog as large as 720 before I need to scale up.
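To make the arithmetic above concrete, here is a minimal sketch of the threshold calculation. The function names are made up for illustration, and it assumes a fixed processing time per message and that at least one worker is always kept running.

```python
import math


def acceptable_backlog_per_instance(slo_seconds: float, seconds_per_message: float) -> int:
    """Number of messages a single worker can clear within the SLO window."""
    return int(slo_seconds // seconds_per_message)


def instances_needed(queue_depth: int, slo_seconds: float, seconds_per_message: float) -> int:
    """Smallest worker count that can drain queue_depth messages within the SLO.

    Assumption: never scale below one worker, even when the queue is empty.
    """
    per_instance = acceptable_backlog_per_instance(slo_seconds, seconds_per_message)
    if per_instance <= 0:
        raise ValueError("a single message takes longer than the SLO allows")
    return max(1, math.ceil(queue_depth / per_instance))


# 5-minute SLO, 10 seconds per message
print(acceptable_backlog_per_instance(5 * 60, 10))       # 30
# 2-hour SLO, 10 seconds per message
print(acceptable_backlog_per_instance(2 * 60 * 60, 10))  # 720
# 31 queued messages against the 5-minute SLO
print(instances_needed(31, 5 * 60, 10))                  # 2
```

In practice the queue depth would come from the queue's `ApproximateNumberOfMessages` attribute, which is only an approximation.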

Based on this, I can't really place these different SLOs into the same queue, because then they will interfere with each other.

Possible approaches I was considering:

1. Have an SQS queue for each SLO, and auto-scale accordingly.
2. Have multiple message groups (one for each SLO), and then auto-scale based on message group.

Is (2) possible? I couldn't find documentation on it. If both are possible, what are the pros and cons of each?

SQS is scaled by AWS, and you have no control over that. Not sure what do you want to do with "auto-scaling" in SQS? — Marcin, Aug 31 '22 at 23:47
I updated the title to make it more precise. I was referring to this: https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-using-sqs-queue.html basically - using the queue backlog as a way to scale processors. — de1337ed, Sep 01 '22 at 00:08
Is there any chance you can convert your backend to use an AWS Lambda function instead of an Amazon EC2 instance? This would be much more scalable. — John Rotenstein, Sep 01 '22 at 00:45
When you refer to 'message groups', are you referring to the `MessageGroupId` on messages in FIFO queues? That would only _limit_ processing since messages with the same MessageGroupId won't be processed if there are already some being processed by a worker. — John Rotenstein, Sep 01 '22 at 00:50

score 1 · Answer 1 · answered Sep 01 '22 at 00:48

If you have messages to process with different priorities, the normal method is:

1. Create 2 Amazon SQS queues: one for high-priority messages, another for 'normal' messages.
2. Have the workers pull from the high-priority queue first. If it is empty, pull from the other queue.

However, this means that 'normal' messages might never get processed if there are always messages in the high-priority queue, so you could instead have a certain number of workers pulling 'high then normal', and other workers just pulling from 'normal'.
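A rough sketch of that polling order, using in-memory deques as stand-ins for the two queues so it runs without AWS. The names `next_message` and `make_receiver` are hypothetical; with boto3, each receiver would instead wrap a call to `sqs.receive_message(QueueUrl=..., MaxNumberOfMessages=1)` and the worker would call `delete_message` after processing.

```python
from collections import deque
from typing import Callable, Optional, Sequence

# A receiver is any zero-argument callable that returns one message,
# or None when its queue is currently empty.
Receiver = Callable[[], Optional[str]]


def next_message(receivers: Sequence[Receiver]) -> Optional[str]:
    """Try each queue in priority order and return the first message found."""
    for receive in receivers:
        message = receive()
        if message is not None:
            return message
    return None  # every queue was empty


def make_receiver(queue: deque) -> Receiver:
    """In-memory stand-in for an SQS queue (illustration only)."""
    return lambda: queue.popleft() if queue else None
```

A 'high then normal' worker would be given `[make_receiver(high), make_receiver(normal)]`, while a 'normal'-only worker gets just `[make_receiver(normal)]`. How many workers of each kind to run is a tuning decision that depends on the two SLOs.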

A better way still would be to process the messages with AWS Lambda functions. The default concurrency limit of 1000 can be increased on request. AWS Lambda would take care of scaling and capacity, and would likely be cheaper since there is no cost for idle time.
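For reference, a Lambda function triggered by an SQS queue receives messages in batches under `event["Records"]`. The following is only a sketch: `process` is a hypothetical placeholder for the real work, and it assumes JSON message bodies and that `ReportBatchItemFailures` is enabled on the event source mapping, so that only failed messages return to the queue.

```python
import json


def process(body: str) -> None:
    """Hypothetical placeholder for the real request processing."""
    json.loads(body)  # assumption: message bodies are JSON


def handler(event, context):
    """Process each SQS record and report only the ones that failed."""
    failures = []
    for record in event.get("Records", []):
        try:
            process(record["body"])
        except Exception:
            # Failed messages are retried; the rest of the batch is deleted.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

With one queue per SLO, each queue could trigger its own function (or the same function with different concurrency settings), which keeps the slow backlog from delaying the 5-minute requests.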
