
I would like to know whether there is a library or some other mechanism that lets me resume/restart my consumers when there are messages in the SQS queue, and suspend them when there are no more messages left to consume.

As of now, each consumer runs continuously in a while(1) loop. I am looking for a way to suspend and restart the consumers so that they do not sit idle, and to improve their performance.

My application is scheduler-based and runs every 12 hours. Before the next scheduled run, the consumers remain idle for almost 4-5 hours.

  • Any reason why you don't want to check the queue repeatedly? Where are you running the consumer? – William Wang Aug 27 '22 at 14:55
  • The fact that the consumers sit idle for such long periods is what I am trying to avoid. The consumers are running as Fargate instances – BackEndMacavity Aug 27 '22 at 16:37
  • "improve on their performance" - how? – Ermiya Eskandary Aug 27 '22 at 17:43
  • @ErmiyaEskandary, there could be possible scenarios wherein the consumers are unable to handle an exception as a corner case and stop execution. This way the consumer count would keep dropping one at a time and this is what I meant by degradation in performance. Even if that isn't a performance issue that still needs to be factored in. Not all scenarios can be accounted for. – BackEndMacavity Aug 27 '22 at 19:41

1 Answer


The following setup could be used.

  1. Turn off the Fargate consumers when the queue is empty. Create a CloudWatch alarm on the SQS ApproximateNumberOfMessagesVisible metric. The alarm could fire, for example, when the metric reports no visible messages in the queue for X consecutive data points. In the alarm state, it should trigger a Lambda which turns off the Fargate consumers (by setting the desired task count to zero), leaving zero consumers running.

  2. As you mentioned, your tasks run on a schedule. So also create a CloudWatch rule using a cron expression for the same schedule, which triggers a second Lambda responsible for turning the Fargate consumers back on (by setting a non-zero desired task count).
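The two Lambdas described above could look roughly like the sketch below. It assumes the consumers run as an ECS service on Fargate; the cluster name, service name, environment variable names, and the default count of 2 are all hypothetical placeholders, not values from the question. `update_service` with `desiredCount` is the boto3 ECS call that changes the number of running tasks.

```python
import os

# Hypothetical names; replace with your own cluster and service.
CLUSTER = os.environ.get("ECS_CLUSTER", "consumer-cluster")
SERVICE = os.environ.get("ECS_SERVICE", "sqs-consumer")


def set_desired_count(ecs, count, cluster=CLUSTER, service=SERVICE):
    """Set the number of tasks the ECS service should keep running."""
    response = ecs.update_service(
        cluster=cluster, service=service, desiredCount=count
    )
    return response["service"]["desiredCount"]


def _ecs_client():
    # Imported lazily so set_desired_count can be exercised without AWS access.
    import boto3

    return boto3.client("ecs")


def scale_down_handler(event, context):
    """Lambda for step 1 (queue-empty alarm): stop all consumers."""
    return set_desired_count(_ecs_client(), 0)


def scale_up_handler(event, context):
    """Lambda for step 2 (scheduled rule): start the consumers again."""
    # CONSUMER_COUNT is an assumed environment variable, defaulting to 2 tasks.
    count = int(os.environ.get("CONSUMER_COUNT", "2"))
    return set_desired_count(_ecs_client(), count)
```

Each Lambda's execution role would need permission for `ecs:UpdateService` on the service in question.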

  • While this does sound like a good solution, it would involve having a couple of Lambdas in place, plus additional costs. I am also trying to factor in failure scenarios that might need a stop and restart. Is that possible with this solution? Also, would there be a reliable way to loop without a while 1 in Python? – BackEndMacavity Aug 27 '22 at 19:45
  • Two Lambdas are all I can currently think of. Additional costs: nothing compared to the cost of running Fargate instances idle (3-4 hours). What failure scenarios do you want to factor in? Why would you want to loop with while 1 just to start and stop Fargate clusters? Retry with exponential backoff should be fine, I guess. – Jagraj Singh Aug 28 '22 at 07:41
  • Sorry, I misjudged the while 1 situation. But it seems it has no alternative since for SQS we have to follow Polling mechanisms. – Jagraj Singh Aug 28 '22 at 10:13
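
On the polling question raised in the comments: SQS still has to be polled, but long polling (`WaitTimeSeconds` on `receive_message`, up to 20 seconds) lets each call block until messages arrive instead of spinning. Below is a minimal sketch of a consumer loop that exits once the queue has stayed empty for a few consecutive polls, and that survives an exception in a single message instead of dying. The function name `drain_queue`, the `handle` callback, and the default of 3 empty polls are assumptions made for illustration.

```python
def drain_queue(sqs, queue_url, handle, max_empty_polls=3, wait_seconds=20):
    """Process messages until the queue stays empty for max_empty_polls polls.

    Returns the number of messages processed and deleted.
    """
    processed = 0
    empty_polls = 0
    while empty_polls < max_empty_polls:
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=wait_seconds,  # long polling: blocks until messages arrive
        )
        messages = response.get("Messages", [])
        if not messages:
            empty_polls += 1
            continue
        empty_polls = 0
        for message in messages:
            try:
                handle(message["Body"])
            except Exception as exc:
                # Do not delete: the message becomes visible again after the
                # visibility timeout and can be retried or sent to a dead-letter queue.
                print(f"failed to process {message['MessageId']}: {exc}")
                continue
            sqs.delete_message(
                QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
            )
            processed += 1
    return processed
```

In a real consumer, `sqs` would be `boto3.client("sqs")` and `handle` your own processing function. Once `drain_queue` returns, the process can simply exit, so the task stops on its own when there is nothing left to consume.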