Retries in AWS SDK

Question

This doc indicates that using the aws sdk gives you out-of-the-box retry functionality.

Then this doc states

// The maximum number of times that a request will be retried for failures.
// Defaults to -1, which defers the max retry setting to the service
// specific configuration.
MaxRetries *int

which tells me that retry logic configuration happens at the service level.

If this is the case, I would expect individual services to document their retry logic configuration. Not only is this not the case in actuality, several (actions of) services seem to imply that they don't provide retry logic, and, instead, discuss how this might be implemented at the application level. See for example dynamodb:

If DynamoDB returns any unprocessed items, you should retry the batch operation on those items. However, we strongly recommend that you use an exponential backoff algorithm. If you retry the batch operation immediately, the underlying read or write requests can still fail due to throttling on the individual tables. If you delay the batch operation using exponential backoff, the individual requests in the batch are much more likely to succeed.

and sqs:

The receive request attempt ID is the token used for deduplication of ReceiveMessage calls.

During a long-lasting network outage that causes connectivity issues between your SDK and Amazon SQS, it's a best practice to provide the receive request attempt ID and to retry with the same receive request attempt ID if the SDK operation fails.

My question is how do I know when a service is handling the retry logic and when I have to implement it at the application level?

You don't know and, I suspect, you can't know. I think that option is just saying "Do you want the SDK to retry? Note that 1 or more services might also implement their own internal retry mechanism but this SDK knows nothing about that internal implementation detail." — jarmod, Jun 12 '20 at 15:35
Isn't that really confusing? As a user of these services, I'd love to know when the retries are being done for me by the service and when I need to implement them. — Isaac Kleinman, Jun 12 '20 at 16:05
I don’t know what’s actually happening here, of course, because it’s probably an implementation detail and is not documented but it wouldn’t be unreasonable to assume that any service on the web (whether it’s AWS or anyone else) might smooth out local, temporary issues with retries. That’s probably even more true of distributed systems. As a general rule, I rely on the AWS SDK’s default retry mechanism, and I don’t find much beyond that that is actually automatically retryable except for API throttling. — jarmod, Jun 12 '20 at 16:15
Seems like I'm not the only one confused: https://github.com/aws/aws-sdk-go/pull/2926 https://github.com/aws/aws-sdk-go/issues/3027 — Isaac Kleinman, Jun 12 '20 at 20:24

Retries in AWS SDK

0 Answers0