I have an orchestration that takes 100 search terms, splits them into batches of 10, and fans out to search activities (each activity takes 10 names).

A search activity processes its names sequentially. For each name, it makes two search requests to Azure Search: one with spaces and punctuation and one without. I make the requests by calling the Azure Search REST API.

The orchestration waits for all the search activities to resolve and returns the results.

The issue I am facing is that the round trip for the Azure Search HTTP request takes too long in the function app when it is deployed on Azure.

At the start of the search, each request takes 3-4 seconds. But after a few requests, the time for a single request goes up to 17-20 seconds.

Locally, when I run this orchestration with the same input against the same Azure Search index, no request takes more than 1.5-2 seconds, and the orchestration completes in 1.0-1.2 minutes. The deployed app takes 7-8 minutes for the same input against the same index.

The following is how I make the request (code for the search activity function):

const request = require('request');

const requestDefault = request.defaults({
    method: 'GET',
    gzip: true,
    json: true,
    timeout: `some value`,
    time: true, 
    pool: {maxSockets: 100}
  });


module.exports = async function (context, names) {
    let results = [];
    for (let i = 0; i < names.length; i++) {
        results.push(await search(context, names[i]));
        results.push(await search(context, withOutSpaceAndPunctuations(names[i])));
    }
    return results;
}

function search(context, name) {
    let url = createAzureSearchUrl(name);
    return (new Promise((resolve, reject) => {
        requestDefault({
            uri: url,
            headers: { 'api-key': `key` }
        }, function (error, response, body) {
            if (!error) {
                context.log(`round trip time => ${response.elapsedTime/1000} sec`);
                context.log(`elapsed-time for search => ${response.headers['elapsed-time']} ms`);
                resolve(body.value);
            } else {
                reject(new Error(error));
            }
        })
    }));
}

function createAzureSearchUrl(name) {
 return `azure search url`;
}
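
The activity above calls `withOutSpaceAndPunctuations`, whose implementation is not shown. A minimal sketch, assuming it only needs to strip whitespace and Unicode punctuation (the real helper may behave differently):

```javascript
// Hypothetical implementation: the original helper is not part of the post.
// Removes every whitespace character and every Unicode punctuation character.
function withOutSpaceAndPunctuations(name) {
    return name.replace(/[\s\p{P}]/gu, '');
}
```

With this sketch, each name produces two query strings: the original name and the stripped variant.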

The Orchestration

const df = require("durable-functions");

module.exports = df.orchestrator(function* (context) {
    let names = context.bindings.context.input;

    let chunk = 10;
    let batches = [];
    for (let i = 0; i < names.length; i += chunk) {
        let slice = names.slice(i, i + chunk);
        let batch = [];
        for (let j = 0; j < slice.length; j++) {
            batch.push(slice[j]);
        }
        batches.push(batch);
    }

    const tasks = [];
    for (let i = 0; i < batches.length; i++) {
      tasks.push(context.df.callActivity("Search", batches[i]));
    }

    let searchResults = yield context.df.Task.all(tasks);
    return searchResults;
});

The elapsed-time for search is always less than 500 milliseconds.

Following this documentation, I removed the request module and used the native https module instead, but it made no difference.

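const https = require('https');
const { performance } = require('perf_hooks');
https.globalAgent.maxSockets = 100;

function searchV2(context, name) {
    let url = createAzureSearchUrl(name);
    const t0 = performance.now();
    return (new Promise((resolve, reject) => {
        let options = {headers: { 'api-key': 'key' }}
        https.get(url, options, (res) => {
            // Response headers have arrived; the body may still be streaming.
            const t1 = performance.now();
            context.log(`round trip time => ${(t1-t0)/1000} sec`);
            context.log(`elapsed-time => ${res.headers['elapsed-time']}`);
            // Collect every chunk: resolving on the first 'data' event
            // would return a truncated body for larger responses.
            const chunks = [];
            res.on('data', (d) => chunks.push(d));
            res.on('end', () => resolve(Buffer.concat(chunks)));
        }).on('error', reject); // surface network errors instead of hanging
    }));
}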
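
One variation that may be worth trying, as an assumption rather than a confirmed fix for this latency: route every request through a single shared agent with keep-alive enabled, so connections are reused instead of being opened per request. The names `keepAliveAgent`, `buildSearchOptions` and `searchV3` are hypothetical and not part of the original code.

```javascript
const https = require('https');

// Assumption: one agent shared by all requests in the process, so that
// TCP/TLS connections to the search endpoint are kept open and reused.
const keepAliveAgent = new https.Agent({ keepAlive: true, maxSockets: 100 });

// Hypothetical helper: builds the options object passed to https.get.
function buildSearchOptions(apiKey) {
    return { agent: keepAliveAgent, headers: { 'api-key': apiKey } };
}

// Hypothetical variant of searchV2 that uses the shared agent and
// parses the complete JSON body once the response has ended.
function searchV3(url, apiKey) {
    return new Promise((resolve, reject) => {
        https.get(url, buildSearchOptions(apiKey), (res) => {
            const chunks = [];
            res.on('data', (d) => chunks.push(d));
            res.on('end', () => {
                try {
                    resolve(JSON.parse(Buffer.concat(chunks).toString('utf8')));
                } catch (err) {
                    reject(err); // body was not valid JSON
                }
            });
        }).on('error', reject);
    });
}
```

The agent is created once at module scope so that repeated invocations of the activity on the same instance can share its sockets.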

For testing, I changed the batch size from 10 to 100 so that a single search activity processes all 100 search terms sequentially. In that case every request to Azure Search took 3.0-3.5 seconds. But 3.5 s × 200 requests = 700 s, or about 11.7 minutes, so not fanning out is not an option.

The deployed app was running on 1 instance. I scaled it to 6 instances. With 6 instances, a single request now takes 3.5-7.5 seconds, and the total time for 100 search terms is 4.0-4.3 minutes. Scaling to 6 instances improved things considerably, but many requests still take 7.5 seconds. The maxConcurrentActivityFunctions parameter was set to 6 in the host file.

I then increased the instance count to 10 and maxConcurrentActivityFunctions to 10 as well, but it still takes 4.0-4.3 minutes for 100 search terms, so there was no further improvement. I saw many requests taking up to 10 seconds.

I do not think it is a code-level issue. It has something to do with fanning out and making multiple concurrent requests for the same function.

Why is this happening in the deployed app and not locally? What should I do to decrease the request latency? Any suggestions will be appreciated.

My function app runs on an Azure App Service plan. My DurableTask version is 1.7.1.

Nafis Islam
  • It seems that your code is not doing fan-out. You are awaiting each search request: `await search(context, names[i])`. You can create the tasks, add them to a task collection, and await the collection so that the requests are processed in parallel. Read about the Azure Functions fan-out example [here](https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-cloud-backup?tabs=csharp#e2_backupsitecontent-orchestrator-function). I'm not sure about the syntax in Node.js, but you can use tasks as shown in that documentation. – user1672994 Jul 31 '21 at 13:51
  • @user1672994 The code I posted above is the code of the activity function. The orchestration calls this function in the fan-out pattern, each function taking 10 names. The `search(context, names[i])` function is called from the activity function, where I make 10 search requests sequentially. – Nafis Islam Jul 31 '21 at 14:21
  • You are awaiting each request: `await search(context, names[i])`? In your posted code I see `await`. – user1672994 Jul 31 '21 at 14:34
  • My question is why each request takes 17-20 seconds to resolve when locally it takes 1.5-2 seconds. I am making requests to the same endpoint both locally and from the function app. I have added the orchestration part of the code. – Nafis Islam Jul 31 '21 at 14:39
  • @user1672994 I have also tried making concurrent requests with `await Promise.all(results);` where `let results = [search(name[1]), search(name[2]), ..., search(name[10])]`. It still took 17-20 seconds for each response. – Nafis Islam Jul 31 '21 at 14:41
  • How much data is returned in each request? Can you also measure the start and end time of each request? The code running faster locally implies there's some difference between the environments. – 8163264128 Aug 05 '21 at 19:04
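
The start and end time of each request, as asked about in the last comment, could be captured with a small wrapper. This is a minimal sketch; `timed` is a hypothetical helper, not something from the original code or from any library.

```javascript
// Hypothetical helper: runs a promise-returning function and records
// wall-clock start/end timestamps plus the elapsed time in milliseconds.
async function timed(label, fn) {
    const startedAt = new Date().toISOString();
    const t0 = process.hrtime.bigint(); // monotonic clock, nanoseconds
    let result;
    let error;
    try {
        result = await fn();
    } catch (err) {
        error = err; // keep the failure so the timing is still reported
    }
    const elapsedMs = Number(process.hrtime.bigint() - t0) / 1e6;
    const finishedAt = new Date().toISOString();
    return { label, startedAt, finishedAt, elapsedMs, result, error };
}
```

Inside the activity it could be used as `const r = await timed(names[i], () => search(context, names[i]));`, followed by logging `r.startedAt`, `r.finishedAt` and `r.elapsedMs`. Comparing these timestamps across activities should show whether requests are slow from the start or are queued behind one another.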

1 Answer

Latency increases when indexing is happening in parallel with queries. Is that the case for you? The elapsed-time value for the query may not take that latency into account.

On the Azure portal, navigate to your search resource and open the monitoring tab; there you should be able to see the latency, the number of queries, and the percentage of throttled queries. That should provide some direction. What tier is your search service on? How many partitions and replicas have you provisioned for it?

As a test, you can increase the number of replicas and partitions to see if that helps with your performance. It did for me.

Michael Scott