I have written a gRPC server that contains multiple RPC services; some are unary and some are server-side streaming.
It connects to a Kubernetes server, so I am using the Python Kubernetes client to query it.
Currently I am having some performance problems: I suspect that when multiple requests come in, the server waits for every worker to finish before it can serve the next incoming request.
This is how I start the server:

```python
import grpc
from concurrent import futures

def startServer():
    global server
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    servicer_grpc.add_Servicer_to_server(Servicer(), server)
    server.add_insecure_port('[::]:' + str(port))
    server.start()
```
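To illustrate what I think is happening, here is a standalone sketch (not my real server, and the task is made up) showing that a ThreadPoolExecutor with N workers queues the (N+1)th task until a worker frees up:

```python
import time
from concurrent import futures

def slow_task(i):
    time.sleep(0.2)  # stand-in for a slow Kubernetes query
    return i

start = time.monotonic()
with futures.ThreadPoolExecutor(max_workers=2) as pool:
    # Three tasks, two workers: the third task sits in the queue
    # until one of the first two finishes.
    handles = [pool.submit(slow_task, i) for i in range(3)]
    results = [h.result() for h in handles]
elapsed = time.monotonic() - start

print(results)        # [0, 1, 2]
print(elapsed > 0.3)  # True: the third task waited for a free worker
```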
My questions are:
How can I improve performance? Will increasing max_workers in the ThreadPoolExecutor help?
How can I diagnose the problem and isolate what is causing the slowdown?
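To make this concrete, here is the kind of instrumentation I could imagine adding myself — a hypothetical timing decorator (the handler below is a made-up stand-in) that logs how long each handler takes:

```python
import functools
import logging
import time

def timed(fn):
    """Hypothetical helper: log how long each RPC handler takes.

    Note: for a server-side streaming handler (a generator), this only
    times generator creation, not the streaming itself.
    """
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        try:
            return fn(*args, **kwargs)
        finally:
            logging.info("%s took %.3fs", fn.__name__,
                         time.monotonic() - start)
    return wrapper

# Made-up stand-in for a unary handler, just to show the decorator in use.
@timed
def fake_handler(request):
    return request.upper()

print(fake_handler("ok"))  # prints "OK"
```

Is something along these lines the usual approach, or is there a more standard tool for this?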
Does the size of the response matter here, since I am streaming bytestrings to the client? Is there a way to measure the size of a response, and does it matter in Python gRPC?
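For example, since my responses are plain bytestring chunks, I imagine I could count bytes per stream with a small wrapper like the sketch below (names are made up; for a protobuf message, I understand `msg.ByteSize()` or `len(msg.SerializeToString())` gives the serialized size):

```python
def chunk_sizes(chunks):
    """Hypothetical wrapper: yield each chunk with the running byte
    total, so sizes could be logged while streaming."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
        yield chunk, total

# Made-up stream of bytestring chunks.
stream = [b"a" * 1024, b"b" * 2048, b"c" * 512]
total = 0
for chunk, total in chunk_sizes(stream):
    pass
print(total)  # 3584 bytes streamed in this example
```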
I would like to know how you diagnose your own Python gRPC servers, so that you know where to improve.