
At a given point in time, the Azure portal metrics for our Cosmos DB account show that 429s (throttling) happened many times, but our C# code (against the same Cosmos DB account/database/collection, using the Cosmos SQL SDK) never throws a 429, not even once. Because of that, the RU scale-up logic we wrote to run on every occurrence of a 429/throttle is never triggered. The confusion is whether throttling really happened as the portal shows, or whether the portal is reporting wrong data. Which one should we consider true, the portal data or the C# code behavior? And if the portal is correct, why does the C# code not raise the same issue? Any suggestions please.

exception catch logic:

catch (DocumentClientException ex)
{
    // 429 = Too Many Requests, i.e. the request was throttled
    if (ex.StatusCode == (HttpStatusCode)429)
    {
        //RU scale up logic
    }
}
191180rk

1 Answer


The SDK will retry automatically upon receiving a 429, up to the number of attempts defined in RetryOptions.MaxRetryAttemptsOnThrottledRequests (9 by default). Those throttled requests are still counted in the portal metrics even when a retry eventually succeeds, which is why the portal can show 429s that your code never sees: both sources are "true", they just measure different things.

If you don't want the SDK to retry, you can set this value to 0, and any 429 will get thrown to your user code:

// Disable the SDK's automatic retries so 429s surface as exceptions
ConnectionPolicy connectionPolicy = new ConnectionPolicy();
connectionPolicy.RetryOptions.MaxRetryAttemptsOnThrottledRequests = 0;
DocumentClient client = new DocumentClient(new Uri("service endpoint"), "auth key", connectionPolicy);
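With retries disabled, the catch block from the question will actually fire. Here is a minimal sketch of handling the 429 yourself; collectionUri and document are placeholders for whatever operation you are actually issuing, and the exception's RetryAfter property carries the back-off interval the service suggests:

try
{
    await client.CreateDocumentAsync(collectionUri, document);
}
catch (DocumentClientException ex) when (ex.StatusCode == (HttpStatusCode)429)
{
    // Honor the back-off hint returned by the service before retrying
    await Task.Delay(ex.RetryAfter);

    // ...retry the operation and/or run your RU scale-up logic here
}

Keep in mind that with MaxRetryAttemptsOnThrottledRequests = 0 you now own all retry handling, including the transient 429s the SDK would otherwise have absorbed silently.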
Matias Quaranta
  • Given that our application is not that busy at all times, what would be the recommended approach: 1. scale up RUs on the very first occurrence of a 429, or 2. scale up RUs only after the Cosmos SDK's default 9 retries are exhausted? What would be the pros and cons of approaches 1 and 2? Please suggest. – 191180rk Oct 24 '19 at 16:18
  • 1
    Cosmos let's you scale up and down pretty fast programmatically. But it really depends on your business. If 429s are very random, and you don't provide a business SLA that needs to be maintained, then letting the SDK retry is just fine. But if you are constantly getting 429s, then that means you are underprovisioned on a normal basis, you need to either raise the RUs or investigate if you can optimize your operations to reduce RU usage. I wouldn't scale up for a single 429 though, but if you implement an algorithm that detects a spike of 429s and then scales up, that could also work. – Matias Quaranta Oct 24 '19 at 16:24
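For reference, a rough sketch of the "detect a spike of 429s, then scale up" idea from the comment above. The window, threshold, and increment are arbitrary example values, OnThrottledAsync is a hypothetical helper you would call from the 429 catch block, and the offer replacement uses the v2 SDK's CreateOfferQuery / OfferV2 / ReplaceOfferAsync pattern (System.Linq, Microsoft.Azure.Documents and Microsoft.Azure.Documents.Client namespaces assumed):

private static int throttleCount = 0;
private static DateTime windowStart = DateTime.UtcNow;

// Hypothetical helper: call this from the 429 catch block.
// Simplified for illustration; not thread-safe.
private static async Task OnThrottledAsync(DocumentClient client, string collectionSelfLink)
{
    if (DateTime.UtcNow - windowStart > TimeSpan.FromMinutes(1))
    {
        // Start a new one-minute observation window
        throttleCount = 0;
        windowStart = DateTime.UtcNow;
    }

    if (++throttleCount < 10) return; // fewer than 10 throttles in the window: not a spike yet

    // Read the collection's current offer (its provisioned throughput)
    Offer offer = client.CreateOfferQuery()
        .Where(o => o.ResourceLink == collectionSelfLink)
        .AsEnumerable()
        .Single();
    int currentRUs = ((OfferV2)offer).Content.OfferThroughput;

    // Bump throughput by an example increment of 100 RU/s
    await client.ReplaceOfferAsync(new OfferV2(offer, currentRUs + 100));
    throttleCount = 0;
}

If you go this route, remember to scale back down once the spike passes, otherwise you keep paying for the raised throughput indefinitely.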