
I have the following code:

async function bulkInsert(db, collectionName, documents) {
  try {
    const cosmosResults = await db.collection(collectionName).insertMany(documents);
    console.log(cosmosResults);
    return cosmosResults;
  } catch (e) {
    console.log(e);
  }
}

If I run it with a large array of documents, I get (not unexpectedly):

{ MongoError: Message: {"Errors":["Request rate is large"]}
  ActivityId: b3c83c38-0000-0000-0000-000000000000,
  Request URI: /apps/DocDbApp/services/DocDbServer24/partitions/a4cb4964-38c8-11e6-8106-8cdcd42c33be/replicas/1p/,
  RequestStats: , SDK: Microsoft.Azure.Documents.Common/1.19.102.5
    at G:\Node-8\NodeExample\node_modules\oracle-movie-ticket-demo\node_modules\mongodb-core\lib\connection\pool.js:596:61
    at authenticateStragglers (G:\Node-8\NodeExample\node_modules\oracle-movie-ticket-demo\node_modules\mongodb-core\lib\connection\pool.js:514:16)
    at Connection.messageHandler (G:\Node-8\NodeExample\node_modules\oracle-movie-ticket-demo\node_modules\mongodb-core\lib\connection\pool.js:550:5)
    at emitMessageHandler (G:\Node-8\NodeExample\node_modules\oracle-movie-ticket-demo\node_modules\mongodb-core\lib\connection\connection.js:309:10)
    at TLSSocket.<anonymous> (G:\Node-8\NodeExample\node_modules\oracle-movie-ticket-demo\node_modules\mongodb-core\lib\connection\connection.js:452:17)
    at emitOne (events.js:116:13)
    at TLSSocket.emit (events.js:211:7)
    at addChunk (_stream_readable.js:263:12)
    at readableAddChunk (_stream_readable.js:250:11)
    at TLSSocket.Readable.push (_stream_readable.js:208:10)
  name: 'MongoError',
  message: 'Message: {"Errors":["Request rate is large"]}\r\nActivityId: b3c83c38-0000-0000-0000-000000000000, Request URI: /apps/DocDbApp/services/DocDbServer24/partitions/a4cb4964-38c8-11e6-8106-8cdcd42c33be/replicas/1p/, RequestStats: , SDK: Microsoft.Azure.Documents.Common/1.19.102.5',
  _t: 'OKMongoResponse',
  ok: 0,
  code: 16500,
  errmsg: 'Message: {"Errors":["Request rate is large"]}\r\nActivityId: b3c83c38-0000-0000-0000-000000000000, Request URI: /apps/DocDbApp/services/DocDbServer24/partitions/a4cb4964-38c8-11e6-8106-8cdcd42c33be/replicas/1p/, RequestStats: , SDK: Microsoft.Azure.Documents.Common/1.19.102.5',
  '$err': 'Message: {"Errors":["Request rate is large"]}\r\nActivityId: b3c83c38-0000-0000-0000-000000000000, Request URI: /apps/DocDbApp/services/DocDbServer24/partitions/a4cb4964-38c8-11e6-8106-8cdcd42c33be/replicas/1p/, RequestStats: , SDK: Microsoft.Azure.Documents.Common/1.19.102.5' }

It appears that some (approx. 165) of the 740 records I was processing have been loaded. All of them appear to have been assigned '_id' attributes.

Does anyone have any idea how to handle this (or at least tell which records were inserted and which were not processed)?
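
One workaround I'm considering is to pre-assign the '_id' values myself before calling insertMany, so that after a failure I can query which ids actually made it into the collection. This is only a rough sketch, assuming the standard mongodb Node driver; 'assignIds' and 'findInsertedIds' are names I made up:

const ObjectID = require('mongodb').ObjectID;

// Rough sketch (not production code): give every document a known _id
// up front instead of relying on the _id the driver injects during insertMany.
function assignIds(documents) {
  return documents.map(doc =>
    doc._id ? doc : Object.assign({ _id: new ObjectID() }, doc)
  );
}

// After a failed bulk insert, ask the collection which of the
// pre-assigned ids are actually present.
async function findInsertedIds(db, collectionName, documents) {
  const ids = documents.map(doc => doc._id);
  const inserted = await db.collection(collectionName)
    .find({ _id: { $in: ids } })
    .project({ _id: 1 })
    .toArray();
  // Return a set of hex strings so callers can check
  // insertedSet.has(doc._id.toHexString()) per document.
  return new Set(inserted.map(doc => doc._id.toHexString()));
}

That costs an extra round trip after a failure, but at least it doesn't depend on parsing the exception.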


1 Answer


Requests to Cosmos DB consume request units (RUs). Your insert request exceeded the provisioned RU throughput, so error code 16500 occurred.

Applications that exceed the provisioned request units for a collection will be throttled until the rate drops below the reserved level. When a throttle occurs, the backend will preemptively end the request with a 16500 error code - Too Many Requests. By default, API for MongoDB will automatically retry up to 10 times before returning a Too Many Requests error code.

You can find more details in the official documentation.

You can try the following approaches to solve the issue:

  1. Import your data in batches to reduce the request rate.

  2. Add your own retry logic in your application (a sketch combining this with batching follows the list).

  3. Increase the reserved throughput for the collection. Of course, this increases your cost.
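
Here is a rough sketch combining 1 and 2, assuming the same bulkInsert signature as in your question. The names bulkInsertWithRetry, BATCH_SIZE and MAX_RETRIES are my own, and the batch size, retry count and backoff delay are illustrative values you would need to tune against your provisioned RUs:

// Minimal sketch: insert in small batches and back off whenever Cosmos DB
// throttles with error code 16500 ("Request rate is large").
// BATCH_SIZE, MAX_RETRIES and the base delay are placeholder values.
const BATCH_SIZE = 50;
const MAX_RETRIES = 5;

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

async function bulkInsertWithRetry(db, collectionName, documents) {
  const collection = db.collection(collectionName);
  const results = [];

  for (let i = 0; i < documents.length; i += BATCH_SIZE) {
    const batch = documents.slice(i, i + BATCH_SIZE);

    for (let attempt = 0; ; attempt++) {
      try {
        results.push(await collection.insertMany(batch));
        break; // this batch succeeded, move on to the next one
      } catch (e) {
        // Re-throw anything that is not throttling, and give up
        // after MAX_RETRIES attempts on the same batch.
        if (e.code !== 16500 || attempt >= MAX_RETRIES) throw e;
        // Exponential backoff before retrying the same batch.
        await sleep(100 * Math.pow(2, attempt));
      }
    }
  }
  return results;
}

One caveat: if a batch partially succeeds before being throttled, retrying it can raise duplicate key errors for the documents that did get in, so keeping batches small (or pre-assigning ids as discussed in the question) limits that window.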

You could refer to this article.

Hope it helps you.


Update Answer:

It looks like your documents are not uniquely identifiable. So I think the "_id" attribute, which is automatically generated by Cosmos DB, cannot be used to determine which documents have been inserted and which have not.

I suggest increasing the throughput settings, emptying the database, and then bulk importing the data.

Considering the cost, please refer to this document for setting an appropriate RU value.

Or you could test the bulk import operation locally via the Cosmos DB Emulator.

  • Jay. Thanks for your reply. I was trying to work out how to handle the exception in a graceful manner and retry the operations for the documents that were not inserted. Given the information in the exception, this appears almost impossible. All of the documents have been assigned "_id" attributes (by the mongo client s/w); surely there has to be a better solution than testing whether each of the assigned ids corresponds to an entry in the Cosmos table... – mark d drake Jan 03 '18 at 20:47
  • I am not sure what you mean by all of the documents having been assigned "_id" attributes. Have these documents been inserted into the database? – Jay Gong Jan 04 '18 at 06:19
  • It appears that the mongo client assigns an _id attribute to each document in the array before starting the insert. So, to answer your second question, when the error is thrown, all of the documents have been assigned an _id attribute but only some of them have been inserted into the database. Hence my problem: I need to determine (ideally from the exception) which documents have been inserted and which have not. I cannot use the presence of the _id attribute, since this appears to have been injected into all of the documents before the insert operations were attempted. – mark d drake Jan 04 '18 at 06:37
  • Jay.. AFAIK the _id is added by the Mongo Node API, not Cosmos. It really doesn't matter how much I increase my RUs, as sooner or later, given a large enough set of documents to bulk insert, I will hit this error and need to be able to recover from it in a graceful manner. Clearing out the database and trying again is not an option for a multi-user system. – mark d drake Jan 05 '18 at 04:55
  • @mark d drake were you able to figure out the solution? – Prabhat Mishra Nov 21 '18 at 15:19
  • 4
    This is a no answer. CosmosDB should return operation that failed or the underlying API should retry as possible as with the insertMany – Martin Kosicky Apr 16 '19 at 05:25