Tensorflow Serving client-side batching

Question

Python: 3.6.6

Tensorflow: 1.10.0

Tensorflow-Serving: 1.10.0

I've seen multiple examples (like "How to do batching in Tensorflow Serving?") that handle client-side batching with something like the following code:

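# Imports assumed for this snippet (not shown in the original examples)
import grpc
from tensorflow import make_tensor_proto
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Create Stub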
channel = grpc.insecure_channel(FLAGS.server)
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

image_data = []
for image in FLAGS.input_image.split(','):
    with open(image, 'rb') as f:
        image_data.append(f.read())

# Create prediction request object
request = predict_pb2.PredictRequest()

# Specify model name (must be same as when TF server started)
request.model_spec.name = 'inference'

# Specify signature name (must be the same as specified when exporting the model)
request.model_spec.signature_name = 'detection_signature'

request.inputs['inputs'].CopyFrom(
    make_tensor_proto(image_data, shape=[len(image_data)])
)

# Call the prediction server
results = stub.Predict(request, 10.0) # 10 secs timeout

However, I get the following error:

Traceback (most recent call last):
  File "client_batch.py", line 64, in <module>
    results = stub.Predict(request, 10.0) # 10 secs timeout
  File "/path/to/python3.6/site-packages/grpc/_channel.py", line 514, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/path/to/w/lib/python3.6/site-packages/grpc/_channel.py", line 448, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
 status = StatusCode.INVALID_ARGUMENT
 details = "Expects arg[0] to be uint8 but string is provided"
 debug_error_string = "{"created":"@1534265330.005987356","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1095,"grpc_message":"Expects arg[0] to be uint8 but string is provided","grpc_status":3}"

After reading up on make_tensor_proto, I found that its "values" argument accepts a Python scalar, a Python list, a NumPy ndarray, or a NumPy scalar. So it appears the current version should support string values, as in this example.
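The dtype mismatch reported in the error can be reproduced without a server. This is a minimal sketch, assuming the model's input expects decoded uint8 pixels (which the error message suggests): raw file bytes convert to a string-typed array, while decoded pixels are unsigned integers. The helper `describe_dtype` and the stand-in data are hypothetical, not part of the original client.

```python
import numpy as np


def describe_dtype(values):
    """Return the NumPy dtype kind the given values convert to."""
    return np.asarray(values).dtype.kind


# Raw bytes, as returned by open(path, 'rb').read().
# Stand-in data for illustration, not real image files.
encoded_images = [b"\x89PNG-fake-bytes-1", b"\x89PNG-fake-bytes-2"]

# Decoded pixels, as an image reader would return them (height, width, channels).
decoded_image = np.zeros((4, 6, 3), dtype=np.uint8)

print(describe_dtype(encoded_images))  # 'S': byte strings
print(describe_dtype(decoded_image))   # 'u': unsigned integers
```

If the exported signature takes a uint8 image tensor, the byte-string form would presumably be rejected no matter which shape is declared.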

I was able to get the code to work for non-batching inputs by replacing

make_tensor_proto(image_data, shape=[len(image_data)])

with

img = scipy.misc.imread(FLAGS.input_image)
make_tensor_proto(img, shape=[1] + list(img.shape))

This passes an ndarray and specifies the exact shape of the input.

However, this does not appear to scale to multiple arrays. You end up with an error like:

got shape [2,1120, 1152, 3], but wanted [2]
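One possible reading of this message is that the declared shape still lists only the batch dimension while the values are full image arrays. The sketch below, assuming every image in the batch has the same height, width, and channel count, stacks the images into one [batch, height, width, channels] array and declares that full shape. The helper names `stack_images` and `build_batch_tensor_proto` are hypothetical, and the zero-filled arrays stand in for decoded images.

```python
import numpy as np


def stack_images(images):
    """Stack equally sized HxWxC arrays into one [batch, H, W, C] uint8 array."""
    shapes = {img.shape for img in images}
    if len(shapes) != 1:
        raise ValueError("all images must share one shape, got %s" % sorted(shapes))
    return np.stack(images).astype(np.uint8)


def build_batch_tensor_proto(batch):
    """Wrap the batch in a TensorProto; requires TensorFlow, imported lazily."""
    import tensorflow as tf
    # Declare the full 4-D shape instead of [len(images)].
    return tf.make_tensor_proto(batch, shape=list(batch.shape))


# Stand-ins for two decoded images with the dimensions from the error message.
images = [np.zeros((1120, 1152, 3), dtype=np.uint8) for _ in range(2)]
batch = stack_images(images)
print(batch.shape)
```

The result of `build_batch_tensor_proto(batch)` could then be copied into `request.inputs['inputs']` in place of the byte-string tensor. Whether the server accepts it depends on the exported signature allowing a variable batch dimension, which is not confirmed here.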

Is there a new way to do this in newer versions of TensorFlow? Or am I doing something clearly wrong?
