
I'm new to Cosmos DB and I'm trying to grasp the RU/s (request units per second) settings and limits.

Situation

Within an ASP.NET Core 2.1 application, I want to insert roughly 3,000 documents at once. Looping over these items and adding them one by one takes minutes, so bulk support might be the way to go. I've followed some resources like:

Bulk import data to Azure Cosmos DB SQL API account by using the .NET SDK on learn.microsoft.com

Introducing Bulk support in the .NET SDK

In my code I used the sample code from the blog post:

List<Task> concurrentTasks = new List<Task>();
foreach (var entity in entities)
{
    entity.Id = GenerateId(entity);

    // Queue the operation without awaiting it; with bulk mode enabled, the SDK
    // groups these concurrent point operations into batched requests.
    concurrentTasks.Add(Container.CreateItemAsync(entity, new PartitionKey(entity.RoleId)));
}

// Wait for all queued inserts to complete.
await Task.WhenAll(concurrentTasks);
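
For context, this pattern relies on bulk support being enabled on the client, as the blog post describes. A minimal sketch of that setup (the connection string and database/container names are placeholders):

CosmosClient client = new CosmosClient(
    "<connection-string>",
    new CosmosClientOptions
    {
        // Lets the SDK group concurrent point operations into batched requests.
        AllowBulkExecution = true
    });

// Matches the Container used in the loop above.
Container Container = client.GetContainer("<database>", "<container>");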

Inserting one document costs about 6 RUs from my local development machine into Azure.
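
For reference, the per-operation charge can be read from the response's RequestCharge property. A sketch (MyEntity stands in for the document type):

ItemResponse<MyEntity> response =
    await Container.CreateItemAsync(entity, new PartitionKey(entity.RoleId));

// RequestCharge reports the RUs consumed by this single write.
Console.WriteLine($"Charge: {response.RequestCharge} RU");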

When I provision the default 400 RU/s, I quickly get 429 Too Many Requests exceptions. When I switch to Autoscale, the import finishes within about 20 seconds without exceptions.

My question is: if I want to cap the RU/s and still use this concurrentTasks approach, will the SDK handle retries, or do I need to write my own 429 retry logic?

JonHendrix
  • You should be able to use Fiddler to determine if retries are happening. BTW, I have had great success with using [this stored procedure](https://azurecosmosdb.github.io/labs/dotnet/labs/07-transactions-with-continuation.html) for bulk operations. It is lightning fast and happens server-side, so no 429s. – Crowcoder Jul 28 '20 at 11:53

1 Answer


Bulk mode in the V3 SDK does apply retries on 429s (you can verify this by looking at the Diagnostics property on any of the operations).
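
For example, here is a sketch based on your snippet (MyEntity stands in for your document type): on success the response exposes the diagnostics, and when retries are exhausted the thrown CosmosException carries the same information.

try
{
    ItemResponse<MyEntity> response =
        await Container.CreateItemAsync(entity, new PartitionKey(entity.RoleId));

    // Even on success, Diagnostics records any 429s the SDK retried internally.
    Console.WriteLine(response.Diagnostics.ToString());
}
catch (CosmosException ex) when ((int)ex.StatusCode == 429)
{
    // The SDK gave up after exhausting its retries; the exception carries
    // the same diagnostics, including each retried attempt.
    Console.WriteLine(ex.Diagnostics.ToString());
}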

The number of retries is governed by CosmosClientOptions.MaxRetryAttemptsOnRateLimitedRequests (default 9). The fact that you are still getting the error means the SDK already retried 9 times. You can increase this value to keep the SDK retrying (the import will simply take longer).
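
Both the retry count and the maximum cumulative wait time can be raised on the client options. A sketch (20 retries and 2 minutes are illustrative values, not recommendations):

CosmosClient client = new CosmosClient(
    "<connection-string>",
    new CosmosClientOptions
    {
        AllowBulkExecution = true,
        // Retry each rate-limited operation up to 20 times (default is 9)...
        MaxRetryAttemptsOnRateLimitedRequests = 20,
        // ...waiting up to 2 minutes in total on 429s (default is 30 seconds).
        MaxRetryWaitTimeOnRateLimitedRequests = TimeSpan.FromMinutes(2)
    });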

The fact that enabling Autoscale helped means the load you want to push is too high for the provisioned throughput (400 RU/s, as you mention). Autoscale detects the throttling and raises the provisioned throughput to accommodate the load.
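
If you want to set this up from code rather than the portal, the V3 SDK can also provision a container with autoscale throughput. A sketch, assuming database is a Microsoft.Azure.Cosmos.Database instance (the 4000 RU/s ceiling and the /roleId partition key path are assumptions):

// Autoscale scales between 10% of the configured maximum and the maximum itself.
ContainerProperties properties = new ContainerProperties(
    id: "<container>",
    partitionKeyPath: "/roleId");

Container container = await database.CreateContainerIfNotExistsAsync(
    properties,
    ThroughputProperties.CreateAutoscaleThroughput(4000));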

Matias Quaranta
  • Thank you and AnuragSharma. This helped me a lot. I tweaked the retry and timeout options together with the RU/s limit in the Cosmos DB scale settings; that gave me a clear picture of how Cosmos works with these limits. With 20 retries and a 2-minute wait, 400 RU/s worked, but it took a long time. Raising the limit to 1500 RU/s and keeping the default SDK settings also worked. So it all comes down to fine-tuning and expectations about performance/throughput and pricing. – JonHendrix Jul 29 '20 at 07:27