I have a system where a Lambda is triggered with an SQS queue as its event source. Each message gets our own internal unique ID to differentiate between requests.

Lambda keeps the message in flight while processing it and deletes it from the queue automatically after a successful invocation, so ideally a unique message should never be processed twice.

But when I checked my logs, a message with the same unique ID had been processed twice, within 100 milliseconds of each other. So it seems two Lambdas were triggered for one message and something failed on the AWS side, either the visibility timeout or something else. I have read online that a few others have run into the same situation.

Can anyone who has gone through the same situation explain how they solved it? Or can people running scalable systems without this kind of issue help me with the reasons why I could be having it?

Note: one single message was successfully executed twice; this was not a case of retry on failure.

  • Can you make sure your Lambda doesn't return an error? – Thunderbolt Engineer Oct 19 '20 at 13:22
  • Related: https://medium.com/@piyush.jaware_25441/avoiding-continuous-lambda-retries-with-sqs-af22138b9eee – jarmod Oct 19 '20 at 13:34
  • @ThunderboltEngineer Yes, the Lambda didn't return any error; the message was processed twice without any error. – Rishi Ambwani Oct 20 '20 at 18:25
  • @jarmod It's actually not related. The author is talking about the case where the message fails and is retried. What I am saying is that a single message was processed twice without the first one failing. The first one executed successfully and at the same time another was processed; the difference was in the hundreds of milliseconds. – Rishi Ambwani Oct 20 '20 at 18:27

1 Answer

I faced a similar issue, where a Lambda (let's call it lambda-1) is triggered through a queue and in turn invokes lambda-2 synchronously (https://docs.aws.amazon.com/lambda/latest/dg/invocation-sync.html). The message goes in flight, returns to the queue after the visibility timeout expires, and triggers lambda-1 again. This goes on in a loop.

As per the link above:

"For functions with a long timeout, your client might be disconnected during synchronous invocation while it waits for a response. Configure your HTTP client, SDK, firewall, proxy, or operating system to allow for long connections with timeout or keep-alive settings."

Making asynchronous calls in lambda-1 can resolve this issue. In the case above, invoking lambda-2 with InvocationType='Event' returns immediately, so lambda-1 finishes and the item is deleted from the queue.
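
As a rough illustration of the asynchronous call described above, here is a minimal sketch of what lambda-1 could look like with boto3. The function name `forward_records`, the target name `"lambda-2"`, and the payload shape are assumptions made for this example, not details of my actual setup. The client is passed in as a parameter so the forwarding logic can be tried with a stand-in object.

```python
import json


def forward_records(event, lambda_client, function_name="lambda-2"):
    """Invoke function_name asynchronously, once per SQS record.

    function_name and the payload shape are hypothetical placeholders.
    """
    status_codes = []
    for record in event["Records"]:
        response = lambda_client.invoke(
            FunctionName=function_name,
            # 'Event' queues the invocation and returns right away,
            # instead of waiting for lambda-2 to finish.
            InvocationType="Event",
            Payload=json.dumps({"body": record["body"]}).encode("utf-8"),
        )
        status_codes.append(response["StatusCode"])
    return status_codes


def handler(event, context):
    # Imported here so forward_records can be exercised without boto3.
    import boto3

    # Returning without raising lets the SQS trigger delete the batch.
    return forward_records(event, boto3.client("lambda"))
```

With `InvocationType="Event"`, Lambda answers with status code 202 once the request is queued, so lambda-1 no longer depends on how long lambda-2 runs. The trade-off is that lambda-1 does not see lambda-2's result or its errors.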