
I am using DocumentDB with an Azure Function App. I have created a basic HTTP trigger in JS to store (insert) entries in DocumentDB.

Throughput for the collection is set to 2500 RU/s.

Here req.body is an array of around 2500 objects, about 1 MB in total, which I believe is fairly small.

module.exports = function (context, req) {
    var res;
    if (req.body) {
        // "document" is the DocumentDB output binding defined in function.json;
        // assigning an array writes every element as a document.
        context.bindings.document = req.body;
        res = { status: 200 };
    }
    else {
        res = {
            status: 400,
            body: "Pass parameters"
        };
    }
    context.done(null, res);
};
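
For reference, the document output binding in function.json looks roughly like this (a minimal sketch; the database, collection, and connection setting names below are placeholders, not my real values):

{
    "bindings": [
        { "type": "httpTrigger", "direction": "in", "name": "req", "authLevel": "function" },
        { "type": "http", "direction": "out", "name": "res" },
        {
            "type": "documentDB",
            "direction": "out",
            "name": "document",
            "databaseName": "mydb",
            "collectionName": "mycollection",
            "createIfNotExists": false,
            "connection": "MyDocDbConnection"
        }
    ]
}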

Every single POST request to the function app takes on average around 30-40 seconds to execute and store the values in the collection, which is really long, and it results in connection timeouts for parallel requests.

  • Is there any performance tweak that can be used with DocumentDB or the Azure Function App to lower the execution time?

  • How does the Function App handle DocumentDB in the background? Is it following best practices?

I am familiar with bulk insert/update operations in other NoSQL databases, but I couldn't find anything similar for DocumentDB.

Vish
1 Answer


Our output bindings enumerate the document array you give us and insert the documents one by one. That is fine for most cases, but if you have advanced bulk import requirements, it might not work for you.

The DocumentDB client APIs don't support bulk insert operations in general; however, bulk inserts can be done case by case by writing server-side stored procedures (e.g. here). While the Azure Functions bindings don't use that stored procedure approach, if you needed to, you could use the DocumentDB client SDK yourself inside your function to call one, as sketched below.
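
A rough sketch of that SDK route, using the documentdb npm package (the sproc name bulkImport, the dbs/colls path, and the app setting names are all illustrative; a stored procedure like the linked sample must already be deployed to the collection):

var DocumentClient = require('documentdb').DocumentClient;

// Endpoint and key come from app settings; names here are assumptions.
var client = new DocumentClient(process.env.DOCDB_ENDPOINT, {
    masterKey: process.env.DOCDB_KEY
});

// Link to a server-side stored procedure already deployed to the collection.
var sprocLink = 'dbs/mydb/colls/mycollection/sprocs/bulkImport';

module.exports = function (context, req) {
    // One round trip: the sproc inserts the whole array server-side.
    // Note sprocs are time-bounded, so very large arrays may require
    // looping over whatever count the sproc reports back.
    client.executeStoredProcedure(sprocLink, [req.body], function (err, result) {
        if (err) {
            context.done(err);
        } else {
            context.done(null, { status: 200, body: result });
        }
    });
};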

Another option you might explore is to take the initial array input in your HTTP function, break it up into smaller groups of documents, and push those out to an import Azure Queue (using our queue binding; documentation here). You could then have a separate function monitoring that queue, using the output binding as you have done above to import these smaller document sets. The queue also gives you scale out: multiple document sets would be imported in parallel. A minimal sketch of that pattern follows.
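
Assuming a storage queue output binding named importQueue on the HTTP function and a queue trigger named batch on a second function (binding names and batch size are illustrative), it could look like this:

// HTTP function: split the incoming array into small batches and enqueue them.
module.exports = function (context, req) {
    var res;
    if (req.body && req.body.length) {
        var batchSize = 100; // keep each message well under the 64 KB storage queue limit
        var batches = [];
        for (var i = 0; i < req.body.length; i += batchSize) {
            batches.push(req.body.slice(i, i + batchSize));
        }
        // Assigning an array to a queue output binding enqueues one message per element.
        context.bindings.importQueue = batches;
        res = { status: 202 };
    } else {
        res = { status: 400, body: "Pass parameters" };
    }
    context.done(null, res);
};

// Queue-triggered function: each message is one batch of documents.
module.exports = function (context, batch) {
    context.bindings.document = batch; // same DocumentDB output binding as above
    context.done();
};

Because the queue trigger scales out, several batches are imported in parallel, and a failed batch is retried independently rather than failing the whole request.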

mathewc
  • How about using the Azure Storage queue service with the function app? Also, is there any benchmark or limitation for concurrent DocumentDB write operations? – Vish Feb 13 '17 at 06:13
  • I suggested using a queue binding above; see the link. We're just a thin layer over the DocumentDB client SDK, so for any potential concurrency limitations you should see the DocumentDB documentation. – mathewc Feb 13 '17 at 16:24
  • About the Azure queue binding: I am planning to use the Service Bus queue binding; will it work? I have an array of 1 MB in size, and the documentation says the message size should be at most 256 KB. You mentioned splitting the array into multiple document sets, but do I need multiple output functions for that? Because I think the function triggers output on context.done. – Vish Feb 14 '17 at 02:05