AWS Lambda - dead letter queue for dead letter queue best practices

Question

I have a Lambda function for which I want to create an SQS dead-letter queue. I started by creating the queue in Terraform:

resource "aws_sqs_queue" "my_lambda_dlq" {
  name                      = "my_lambda_dlq"
  delay_seconds             = 90
  max_message_size          = 2048
  message_retention_seconds = 86400
  receive_wait_time_seconds = 10
  redrive_policy = jsonencode({
    deadLetterTargetArn = aws_sqs_queue.terraform_queue_deadletter.arn
    maxReceiveCount     = 4
  })

  tags = local.default_tags
}

This is the example from the Terraform documentation. However, I got stuck at redrive_policy.

Do I understand correctly that this sets a dead-letter queue for the SQS queue itself?
If I set redrive_policy, that implies I am setting a DLQ on a DLQ. I get the feeling that one can set a DLQ on a DLQ on a DLQ and so on.

I was not able to find any best practices regarding this. Does anyone have any experience with this?

My main goal here is not to lose any messages. Thanks, Luminita

score 2 · Accepted Answer · answered Nov 30 '20 at 11:52

By specifying a redrive_policy you configure where unprocessable / failing messages are sent: once a message has been received maxReceiveCount times without being deleted, SQS moves it to the queue named in deadLetterTargetArn. That target queue is called a DLQ (dead-letter queue), but it is still a normal queue.

And yes, a DLQ can in turn have another DLQ, since every DLQ is itself still just a queue. I cannot think of a situation where you would want that, but nothing stops you from doing it.

"If I set redrive_policy, that implies I am setting a DLQ on a DLQ" - technically a DLQ does not exist as a separate kind of resource; AWS only knows queues. Configuring one queue as another's DLQ does not change the fact that both are queues. Any queue is a DLQ if it is configured as the redrive target of another queue.
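For the Lambda case in the question, this could look roughly like the sketch below: a plain queue with no redrive_policy, attached to the function through dead_letter_config. The role, function name, zip file, handler and runtime are hypothetical placeholders that are not in the question. The retention period is raised to 14 days (the SQS maximum) on the assumption that you want as much time as possible to react. The delay and message size settings from the documentation example are left out, because a small max_message_size could cause larger events to be rejected by the queue.

```hcl
# Plain queue used as the Lambda's DLQ. It has no redrive_policy,
# so nothing is chained after it.
resource "aws_sqs_queue" "my_lambda_dlq" {
  name                      = "my_lambda_dlq"
  message_retention_seconds = 1209600 # 14 days, the SQS maximum
}

# Hypothetical execution role for the function.
resource "aws_iam_role" "my_lambda_role" {
  name = "my_lambda_role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })
}

# The function's role must be allowed to send failed events to the DLQ.
resource "aws_iam_role_policy" "allow_dlq_send" {
  name = "allow_dlq_send"
  role = aws_iam_role.my_lambda_role.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "sqs:SendMessage"
      Resource = aws_sqs_queue.my_lambda_dlq.arn
    }]
  })
}

# Hypothetical function; filename, handler and runtime are placeholders.
resource "aws_lambda_function" "my_lambda" {
  function_name = "my_lambda"
  role          = aws_iam_role.my_lambda_role.arn
  filename      = "lambda.zip"
  handler       = "main.handler"
  runtime       = "python3.12"

  # Events that still fail after Lambda's async retries are sent here.
  dead_letter_config {
    target_arn = aws_sqs_queue.my_lambda_dlq.arn
  }
}
```

Note that redrive_policy does not appear anywhere: it would only matter if my_lambda_dlq itself needed a further DLQ.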

luk2302


Thanks for your answer! I guess what I was after is: does anyone know the best practice for asynchronously invoked Lambdas when one doesn't want to lose any events? Is it recommended to just use an SQS DLQ (without any DLQ on that)? I understand I can create a long chain of queues, but I am wondering how people with lots of experience running Lambda in production recommend doing this in practice. – user2761217 Nov 30 '20 at 12:59
@user2761217 Yes, you need a DLQ for async invocations, and you do not need a DLQ on that queue, because why would you. You should then configure alerting on the DLQ so that you get notified when messages arrive, and then you have to deal with them manually. – luk2302 Nov 30 '20 at 13:00
Well, what happens to the message if for some reason Lambda can't write it to the SQS queue? – user2761217 Dec 01 '20 at 10:09
@user2761217 Then you will get a spike in the `DeadLetterErrors` metric of the Lambda and the message is lost. – luk2302 Dec 01 '20 at 10:10
Re. examples of chains of queues -- Uber uses this pattern of chained DLQs, but calls the intermediary ones "Retry Queues": https://eng.uber.com/reliable-reprocessing/. They run on Kafka but the pattern holds on AWS too. I could see a number of DLQs chained, with a delivery delay set for resilience against transient errors. – Valer Jan 06 '22 at 17:29
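The alerting mentioned in the comments above could be sketched in Terraform roughly as follows. This is only an illustration under assumptions: the SNS topic and the function name "my_lambda" are hypothetical, the queue name is taken from the question, and the thresholds and periods are arbitrary starting points rather than recommendations.

```hcl
# Hypothetical SNS topic that receives the alarm notifications.
resource "aws_sns_topic" "dlq_alerts" {
  name = "my_lambda_dlq_alerts"
}

# Fires as soon as at least one message is waiting in the DLQ.
resource "aws_cloudwatch_metric_alarm" "dlq_not_empty" {
  alarm_name          = "my_lambda_dlq_not_empty"
  namespace           = "AWS/SQS"
  metric_name         = "ApproximateNumberOfMessagesVisible"
  dimensions          = { QueueName = "my_lambda_dlq" }
  statistic           = "Maximum"
  period              = 60
  evaluation_periods  = 1
  comparison_operator = "GreaterThanOrEqualToThreshold"
  threshold           = 1
  treat_missing_data  = "notBreaching"
  alarm_actions       = [aws_sns_topic.dlq_alerts.arn]
}

# Fires when Lambda failed to deliver an event to the DLQ,
# which is the case where the event is lost.
resource "aws_cloudwatch_metric_alarm" "dlq_delivery_failed" {
  alarm_name          = "my_lambda_dead_letter_errors"
  namespace           = "AWS/Lambda"
  metric_name         = "DeadLetterErrors"
  dimensions          = { FunctionName = "my_lambda" }
  statistic           = "Sum"
  period              = 60
  evaluation_periods  = 1
  comparison_operator = "GreaterThanOrEqualToThreshold"
  threshold           = 1
  treat_missing_data  = "notBreaching"
  alarm_actions       = [aws_sns_topic.dlq_alerts.arn]
}
```

The first alarm covers the normal path (an event landed in the DLQ and needs manual handling); the second covers the failure path where the event never reached the DLQ at all.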
