
I am using grpc for a bidirectional streaming service. Because this is an online real-time service and we don't want our clients to wait too long, we want to limit the RPCs to a certain number and reject the extra requests.

We are able to do it in Python with following code:

import grpc
from concurrent import futures

grpcServer = grpc.server(futures.ThreadPoolExecutor(max_workers=10),
                         maximum_concurrent_rpcs=10)

In this case, only 10 requests are processed at a time; additional requests are rejected, and those clients receive a RESOURCE_EXHAUSTED error.

However, we find it hard to do in C++. We use code following the Stack Overflow question "grpc sync server limit handle thread":

ServerBuilder builder;
// One completion queue per desired concurrent RPC.
builder.SetSyncServerOption(ServerBuilder::SyncServerOption::NUM_CQS, 10);
// Cap the total number of threads the server may use.
grpc::ResourceQuota quota;
quota.SetMaxThreads(10 * 2);
builder.SetResourceQuota(quota);

We are using grpc in many services. Some use kaldi, some use libtorch. In some cases the above code behaves normally: it processes 10 requests at a time, rejects other requests, and the processing speed (our service requires a lot of CPU computation) is fine. In some cases it only accepts 9 requests at a time. In other cases it accepts 10 requests at a time, but the processing speed is significantly lower than before.

We also tried

builder.AddChannelArgument(GRPC_ARG_MAX_CONCURRENT_STREAMS, 10);

But it does not help, because GRPC_ARG_MAX_CONCURRENT_STREAMS only limits the concurrent RPCs on a single HTTP/2 connection, not across the whole server.

Could someone please point out the equivalent C++ version of the Python code? We want our service to handle 10 requests at a time and reject other requests; we do not want any request to wait in a queue.

  • grpc::ResourceQuota::SetMaxThreads is the right way to do this. GRPC_ARG_MAX_CONCURRENT_STREAMS only configures the max number of outstanding RPCs on each connection (but there is still no limit on the number of connections). I don't fully understand the problems described with the ResourceQuota approach, but I suggest going with it and then investigating followup issues (if you're seeing performance problems etc., you may want to file an issue on github.com/grpc/issues). – apolcyn Jan 20 '22 at 20:24
  • We have 4 speech-related services. Services A/B/C use kaldi (a speech recognition tool), service D uses libtorch. Service D suffers from performance degradation after SetMaxThreads; I suspect this may be related to the multi-threading technique used in libtorch/MKL. Services A/B are okay after SetMaxThreads. But service C can only support 9 requests, which is exactly one less (if I set NUM_CQS to 20 and SetMaxThreads to 40, it supports 19 requests). What does SetMaxThreads limit? Does it take into account all the threads in the grpc server, or does it only consider the queue and processing threads? – Xiang Lyu Jan 21 '22 at 04:25

0 Answers