I am attempting to export my model for serving via SavedModel, and I am running into issues on the serving client when I make the inference call.
error:
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Missing ModelSpec")
I have four placeholders that must be fed: the input, dropout (1.0 for inference), and two placeholders for the pre-trained GoogleW2V embeddings (a terrible workaround). Due to their size, I must feed the embeddings into the graph via placeholders for the embedding lookup. I split them in half because the serving client has to build a tensor_proto to feed each placeholder, and you cannot create a tensor_proto larger than 2 GB. So, as a workaround, I break the embeddings in half and concat them back together in the graph.
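Roughly, the graph side of that workaround looks like the sketch below (placeholder names, shapes, and input_ids are illustrative, not my exact code):

import tensorflow as tf

# Each half stays under the 2 GB protobuf limit; the graph re-joins them.
# GoogleW2V is roughly 3M x 300 vectors, so each half is 1.5M rows here.
embedding_half1 = tf.placeholder(tf.float32, shape=[1500000, 300])
embedding_half2 = tf.placeholder(tf.float32, shape=[1500000, 300])
embeddings = tf.concat([embedding_half1, embedding_half2], axis=0)
embedded = tf.nn.embedding_lookup(embeddings, input_ids)  # input_ids: int32 token ids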
In my model I save the signature as follows (I have kept it as simple as possible to help figure out how to use multiple inputs):
import tensorflow as tf
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import signature_def_utils
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.saved_model import utils

builder = saved_model_builder.SavedModelBuilder(export_path)

tensor_info_x = utils.build_tensor_info(model._input)
tensor_info_dropout = utils.build_tensor_info(model._dropout)
tensor_info_emb1 = utils.build_tensor_info(model._embeddingPlaceholder1)
tensor_info_emb2 = utils.build_tensor_info(model._embeddingPlaceholder2)
tensor_info_y = utils.build_tensor_info(model.softmaxPredictions)

prediction_signature = signature_def_utils.build_signature_def(
    inputs={'inputs': tensor_info_x,
            'dropout': tensor_info_dropout,
            'googlew2v1': tensor_info_emb1,
            'googlew2v2': tensor_info_emb2},
    outputs={'softmaxPredictions': tensor_info_y},
    method_name=signature_constants.PREDICT_METHOD_NAME)

builder.add_meta_graph_and_variables(
    tfSession,
    [tag_constants.SERVING],
    signature_def_map={'softmaxPredictions': prediction_signature})

builder.save()
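As a sanity check, the export can be reloaded in a fresh session to list the signatures it actually contains (a sketch; export_path is the same directory passed to the builder):

import tensorflow as tf
from tensorflow.python.saved_model import tag_constants

# Reload the SavedModel and print its signature names;
# 'softmaxPredictions' should be among them.
with tf.Session(graph=tf.Graph()) as sess:
    meta_graph = tf.saved_model.loader.load(sess, [tag_constants.SERVING], export_path)
    print(meta_graph.signature_def.keys())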
In the client I do the inference:
import tensorflow as tf
from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2

def do_sem_inference(vs, data):
    host, port = CONFIG.semserver.split(':')
    channel = implementations.insecure_channel(host, int(port))
    stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'sem'
    request.model_spec.signature_name = 'softmaxPredictions'

    proto = tf.contrib.util.make_tensor_proto(data, dtype=tf.int32)
    request.inputs['inputs'].CopyFrom(proto)
    dropoutProto = tf.contrib.util.make_tensor_proto(1.0, dtype=tf.float32)
    request.inputs['dropout'].CopyFrom(dropoutProto)

    # This is the reason I have to break the GoogleW2V embeddings in half:
    # a tensor_proto cannot be larger than 2 GB.
    googlew2vProto1 = tf.contrib.util.make_tensor_proto(vs.wordVectors()[:1500000], dtype=tf.float32)
    request.inputs['googlew2v1'].CopyFrom(googlew2vProto1)
    googlew2vProto2 = tf.contrib.util.make_tensor_proto(vs.wordVectors()[1500000:], dtype=tf.float32)
    request.inputs['googlew2v2'].CopyFrom(googlew2vProto2)

    result_future = stub.Predict.future(request, 100.0)
    results = tf.contrib.util.make_ndarray(result_future.result().outputs['outputs'])
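For reference, each half can be checked against the protobuf cap before sending (an illustrative guard, not part of my actual client; a TensorProto is an ordinary protobuf message):

TWO_GB = 2 * 1024 ** 3
# Protobuf messages are capped at 2 GB, so each half must serialize below that.
assert googlew2vProto1.ByteSize() < TWO_GB
assert googlew2vProto2.ByteSize() < TWO_GB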
But I get the error (as shown above):
Traceback (most recent call last):
  File "sem_client.py", line 121, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "sem_client.py", line 114, in main
    result = do_sem_inference(vectorSpace, embeddingLookup(vectorSpace, sentence))
  File "sem_client.py", line 66, in do_sem_inference
    results = tf.contrib.util.make_ndarray(result_future.result().outputs['outputs'])
  File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 112, in result
    raise _abortion_error(rpc_error_call)
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Missing ModelSpec")
I've scoured the net for help on using multiple inputs in a signature. Some sources suggest exporter.generic_signature, but the source code says that is deprecated in favor of SavedModel, and I have not seen any clear examples of how to use generic_signature anyway. I also have not found any examples that pass multiple inputs to a signature via SavedModel. Any idea how to get this to work?
Thanks for the help and advice.
P.S. I am also interested in ideas for avoiding having to break the GoogleW2V embeddings in half and feed them in via a placeholder (again, this is driven by the need to serve the model). The goal is to look up the embeddings in GoogleW2V and use them in my model. (I am aware of ways to do this without the Google w2v embeddings, but I would prefer to use them.)