We are using Cosmos DB and evaluating how the system performs under load. With a load as low as 7-9K requests, we randomly get "Service is currently unavailable" and "Request timed out" errors.

Has anyone else faced this issue?

Below is the code

CREATE CLIENT

private void Initialize()
{
    Client = new DocumentClient(new Uri(cosmosDbConfiguration.EndPoint), cosmosDbConfiguration.Key, new ConnectionPolicy
    {
        RetryOptions = new RetryOptions() { MaxRetryAttemptsOnThrottledRequests = 10, MaxRetryWaitTimeInSeconds = 180 },

        ConnectionMode = ConnectionMode.Direct,
        ConnectionProtocol = Protocol.Tcp
    });

    Client.OpenAsync().GetAwaiter();

    CreateDatabaseIfNotExistsAsync().Wait();
    CreateAllCollectionsIfNotExistsAsync().Wait();
}

CREATE DOCUMENT

public async Task<Document> CreateDocumentAsync<T>(string collectionName, T dataRow)
{
    return await Client.CreateDocumentAsync(GetCollectionUri(collectionName), dataRow);
}
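
One way to check whether the writes are being throttled is to log the request charge of each write and to catch throttling errors explicitly. The following is only a minimal sketch, assuming the 2.x `Microsoft.Azure.DocumentDB` SDK named in the error messages; the `CosmosDiagnostics` class and the `CreateWithDiagnosticsAsync` method are hypothetical names introduced for illustration.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Hypothetical helper, not part of the SDK or of the code in the question.
public static class CosmosDiagnostics
{
    public static async Task<Document> CreateWithDiagnosticsAsync<T>(
        DocumentClient client, Uri collectionUri, T dataRow)
    {
        try
        {
            ResourceResponse<Document> response =
                await client.CreateDocumentAsync(collectionUri, dataRow);

            // RequestCharge is the RU cost the service reported for this write.
            Console.WriteLine(
                $"Charge: {response.RequestCharge} RU, ActivityId: {response.ActivityId}");
            return response.Resource;
        }
        catch (DocumentClientException ex) when (ex.StatusCode == (HttpStatusCode)429)
        {
            // 429 means the request was rate limited after the SDK's own
            // retries (configured through RetryOptions) were exhausted.
            Console.WriteLine(
                $"Throttled; retry after {ex.RetryAfter}, ActivityId: {ex.ActivityId}");
            throw;
        }
    }
}
```

If the logged charges multiplied by the request rate stay well below the provisioned throughput and no 429 responses appear, throttling is probably not the cause of the errors.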

Below are the error messages, along with their activity IDs:

Service is currently unavailable. ActivityId: 6bfcabb0-69b9-4671-b2d1-8bf1a831d77e, RequestStartTime: 2019-04-08T21:48:14.3002274Z, Number of regions attempted: 1 ResponseTime: 2019-04-08T21:48:48.3076636Z, StoreReadResult: StorePhysicalAddress: rntbd://cdb-ms-prod-westus1-fd4.documents.azure.com:14070/apps/112e82de-8353-4f6c-804f-e6ce36a8282f/services/9f5b24ad-6517-4487-855b-ac5e507a6f53/partitions/e24c749b-c701-4b75-9a16-6dde04028f12/replicas/131991744298606536p/, LSN: -1, GlobalCommittedLsn: -1, PartitionKeyRangeId: , IsValid: False, StatusCode: 410, IsGone: True, IsNotFound: False, IsInvalidPartition: False, RequestCharge: 0, ItemLSN: -1, SessionToken: , ResourceType: Document, OperationType: Query , documentdb-dotnet-sdk/2.2.1 Host/64-bit MicrosoftWindowsNT/6.2.9200.0

Message: Request timed out. ActivityId: 72f7f069-2376-4c71-af44-a78780e53894, Request URI: /apps/28ad6635-acc0-4a33-8cbf-513f2a7ecff0/services/9015ec89-5cc9-4a36-825d-047766c72037/partitions/bdc49db7-9018-4dd4-9100-52dcee4635a4/replicas/131992241935694598p/, RequestStats: RequestStartTime: 2019-04-08T21:59:06.5769926Z, Number of regions attempted: 1 , SDK: documentdb-dotnet-sdk/2.2.1 Host/64-bit MicrosoftWindowsNT/6.2.9200.0

Dijkgraaf
  • Are you surpassing your allocated RUs? – Stephen Cleary Apr 09 '19 at 00:45
  • Linking the same discussion from MSDN: https://social.msdn.microsoft.com/Forums/en-US/09baedc5-5ff0-4515-ad9f-e1dae14f6711/cosmos-db-request-time-out-service-is-currently-unavailable-error?forum=azurecosmosdb – Mike Ubezzi Apr 09 '19 at 01:19
  • The issue is with `Protocol.Tcp` mode versus `Protocol.Https`. TCP can process more messages per second at the client, but the Cosmos DB service is unable to handle the load. If you connect with HTTPS, the per-second message rate is lower but the service remains stable. – Mike Ubezzi Apr 09 '19 at 16:15
  • The error seems to be coming from a Query, but your code does not show any queries. Also, since the methods are async, I would advise marking `Initialize` as `async Task` and using `await` on those async calls instead of `GetAwaiter` or `Wait`. For Request Timeouts see https://learn.microsoft.com/en-us/azure/cosmos-db/troubleshoot-dot-net-sdk#request-timeouts – Matias Quaranta Apr 09 '19 at 16:21
  • @MikeUbezziMSFT Thanks for your feedback and help here. We tried using the HTTPS protocol, but with that we were intermittently getting a Task canceled exception under load. Do you have any idea what the recommended protocol/connection mode is for services that read streaming data and write it to Cosmos DB? (We tried using a Stream Analytics job, but we need some customization and so had to go with an ASF service.) – nirag tibdewal Apr 10 '19 at 04:38
  • @StephenCleary Thanks for replying. That was my first guess as well, so I tried increasing the RUs from 1000 to 10000 and still got the error. Maybe there is something wrong in the code; as Matias suggested, I will try changing the client initialization to async. – nirag tibdewal Apr 10 '19 at 04:44
  • @niragtibdewal Have you solved this issue? – twoflower May 29 '20 at 06:56
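
The suggestion in the comments to make the initialization asynchronous could look roughly like the sketch below. It assumes the same 2.x SDK and the same connection settings as the question; the `CosmosStore` class name and its constructor are hypothetical, and the two helper methods are empty placeholders because their bodies are not shown in the question. Whether this change alone resolves the errors is not established in the thread.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;

// Hypothetical wrapper class; only the client settings come from the question.
public sealed class CosmosStore
{
    private readonly Uri endpoint;
    private readonly string key;

    public DocumentClient Client { get; private set; }

    public CosmosStore(string endpoint, string key)
    {
        this.endpoint = new Uri(endpoint);
        this.key = key;
    }

    // Async replacement for Initialize(): every call is awaited, so failures
    // surface here and no thread is blocked on Wait().
    public async Task InitializeAsync()
    {
        Client = new DocumentClient(endpoint, key, new ConnectionPolicy
        {
            RetryOptions = new RetryOptions
            {
                MaxRetryAttemptsOnThrottledRequests = 10,
                MaxRetryWaitTimeInSeconds = 180
            },
            ConnectionMode = ConnectionMode.Direct,
            ConnectionProtocol = Protocol.Tcp
        });

        // The original called OpenAsync().GetAwaiter() without GetResult(),
        // so it never actually waited for the connection warm-up to finish.
        await Client.OpenAsync();

        await CreateDatabaseIfNotExistsAsync();
        await CreateAllCollectionsIfNotExistsAsync();
    }

    // Placeholders for the helpers named in the question (bodies not shown there).
    private Task CreateDatabaseIfNotExistsAsync() => Task.CompletedTask;
    private Task CreateAllCollectionsIfNotExistsAsync() => Task.CompletedTask;
}
```

A caller would create one `CosmosStore` at startup, `await store.InitializeAsync()` once, and then reuse the single `Client` instance for all requests.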

0 Answers