
I am attempting to export my model for serving via SavedModel and am running into issues on the serving client when making the inference call.

error: 
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Missing ModelSpec")

I have four placeholders that must be provided: the input, dropout (which will be 1.0 for inference), and (a terrible workaround) two placeholders for loading the pre-trained GoogleW2V vector embeddings. Due to their size I must feed the embeddings into the graph via placeholders for the embedding lookup, and I split them in half because the serving client has to build a tensor_proto to feed each placeholder, and you cannot create tensor protos larger than 2 GB. So as a workaround I break the embeddings in half and concat them inside the graph, as sketched below.
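For context, the graph-side half of that workaround looks roughly like this (a sketch, not my exact code; input_ids stands in for whatever int32 tensor feeds the lookup, and the 1500000 x 300 half-shapes follow the client code further down):

import tensorflow as tf

# Each half stays under the 2 GB tensor_proto ceiling on the client side.
emb_ph1 = tf.placeholder(tf.float32, [1500000, 300], name='googlew2v1')
emb_ph2 = tf.placeholder(tf.float32, [1500000, 300], name='googlew2v2')

# Rebuild the full embedding matrix inside the graph, then look up ids.
embeddings = tf.concat([emb_ph1, emb_ph2], axis=0)
embedded_input = tf.nn.embedding_lookup(embeddings, input_ids)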

In my model I save the signature as follows (I have tried to make it as simple as possible to help figure out how to use multiple inputs).

from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import signature_def_utils
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.saved_model import utils

builder = saved_model_builder.SavedModelBuilder(export_path)

tensor_info_x = utils.build_tensor_info(model._input)
tensor_info_dropout = utils.build_tensor_info(model._dropout)
tensor_info_emb1 = utils.build_tensor_info(model._embeddingPlaceholder1)
tensor_info_emb2 = utils.build_tensor_info(model._embeddingPlaceholder2)
tensor_info_y = utils.build_tensor_info(model.softmaxPredictions)

prediction_signature = signature_def_utils.build_signature_def(
    inputs={'inputs': tensor_info_x,
            'dropout': tensor_info_dropout,
            'googlew2v1': tensor_info_emb1,
            'googlew2v2': tensor_info_emb2},
    outputs={'softmaxPredictions': tensor_info_y},
    method_name=signature_constants.PREDICT_METHOD_NAME)

builder.add_meta_graph_and_variables(
    tfSession,
    [tag_constants.SERVING],
    signature_def_map={
        'softmaxPredictions': prediction_signature
    })

builder.save()
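One detail I noticed while writing this up (I don't think it causes the error): TensorFlow Serving also resolves the default key 'serving_default' when a client leaves signature_name unset, so the same signature can additionally be registered under signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY. A sketch of that variant of the builder call:

builder.add_meta_graph_and_variables(
    tfSession,
    [tag_constants.SERVING],
    signature_def_map={
        'softmaxPredictions': prediction_signature,
        # reachable when the client does not set request.model_spec.signature_name
        signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: prediction_signature
    })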

In the client I make the inference call:

import tensorflow as tf
from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2

def do_sem_inference(vs, data):

    host, port = CONFIG.semserver.split(':')
    channel = implementations.insecure_channel(host, int(port))
    stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'sem'
    request.model_spec.signature_name = 'softmaxPredictions'

    proto = tf.contrib.util.make_tensor_proto(data, dtype=tf.int32)
    request.inputs['inputs'].CopyFrom(proto)

    dropoutProto = tf.contrib.util.make_tensor_proto(1.0, dtype=tf.float32)
    request.inputs['dropout'].CopyFrom(dropoutProto)

    #####
    # This is why I have to break the GoogleW2V embeddings in half:
    # a tensor_proto cannot be larger than 2 GB.
    #####

    googlew2vProto1 = tf.contrib.util.make_tensor_proto(vs.wordVectors()[:1500000], dtype=tf.float32)
    request.inputs['googlew2v1'].CopyFrom(googlew2vProto1)
    googlew2vProto2 = tf.contrib.util.make_tensor_proto(vs.wordVectors()[1500000:], dtype=tf.float32)
    request.inputs['googlew2v2'].CopyFrom(googlew2vProto2)

    result_future = stub.Predict.future(request, 100.0)
    results = tf.contrib.util.make_ndarray(result_future.result().outputs['outputs'])

But I get the error (as shown above):

Traceback (most recent call last):
  File "sem_client.py", line 121, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "sem_client.py", line 114, in main
    result = do_sem_inference(vectorSpace, embeddingLookup(vectorSpace, sentence))
  File "sem_client.py", line 66, in do_sem_inference
    results = tf.contrib.util.make_ndarray(result_future.result().outputs['outputs'])
  File "/usr/local/lib/python2.7/dist-packages/grpc/beta/_client_adaptations.py", line 112, in result
    raise _abortion_error(rpc_error_call)
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Missing ModelSpec")
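
(Separately, I notice the signature exports its output under the key 'softmaxPredictions' rather than 'outputs', so once the request itself gets through, the result lookup would presumably need to be:

    results = tf.contrib.util.make_ndarray(result_future.result().outputs['softmaxPredictions'])

)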

I've scoured the net for help on using multiple inputs in a signature. Some say to use exporter.generic_signature, but the source code says this is deprecated in favor of SavedModel, and I have not seen any clear examples that show how to use generic_signature anyway. I have also found no examples that feed multiple inputs into a signature via SavedModel. Any idea how to get this to work?

Thanks for the help and advice.

P.S. I am also interested in ideas for avoiding having to break the GoogleW2V embeddings in half and feed them in via placeholders (again, due to the need to serve this model). The goal is to look up the embeddings in GoogleW2V and use them in my model (I am aware of ways to do this without the Google w2v embeddings, but I would prefer to use these).

Andrew Schenck
  • How do you start your tensorflow-serving? Can I see your start command? The exception means that you didn't pass model_spec, but your code does set that value, so I'm not sure what your problem is. P.S. you can load the embeddings as a Variable and export them as part of your inference graph; if they are too large, you can break them into partitions. – Tianjin Gu Jun 21 '17 at 00:00
  • `tf-serving/serving$ bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9002 --model_name=sem --model_base_path=/home/dev/sem/export/CW11Sem/` And right now for the embeddings I am loading them as a variable: I make a variable and a placeholder, then I assign the placeholder to the variable and feed the placeholder via the feed_dict: `self.X = tf.Variable(tf.constant(0.0, shape=[1500000, 300]), trainable=False, name="X") self._embeddingPlaceholder1 = tf.placeholder(tf.float32, [1500000,300]) self.set_x = self.X.assign(self._embeddingPlaceholder)` – Andrew Schenck Jun 21 '17 at 12:40
  • Couldn't edit the typo, I see; the embedding code is correct, ignore the missing 1 on self._embeddingPlaceholder at the end. – Andrew Schenck Jun 21 '17 at 12:46
  • So I am now more convinced the issue is in fact due to the multi-input signature. I was able to remove/handle the extra placeholders (so I only have the input as a placeholder), and I can export, serve, and get a result from the exact same model minus the extra placeholders (and therefore the extra signature inputs). I will attempt to get this working, but would still love some help with the extra placeholders and the input-signature issue... – Andrew Schenck Jun 21 '17 at 17:20
  • You can init a Variable from an array rather than assigning it through a placeholder; if you do so there will be only 2 placeholders (tensor_info_x and tensor_info_dropout) and 2 input items in the signature when exporting the model. – Tianjin Gu Jun 21 '17 at 23:56
  • @TianjinGu Hmm, I'm not exactly sure what you mean, could you explain it a bit more? The biggest issue is that the variable/placeholder is very large (larger than 2 GB). Do you mind sharing some pseudo-code on initing a Variable from an array? – Andrew Schenck Jun 22 '17 at 13:48
  • Would you please share the Variable-init part of your training code? I will give an example based on it. – Tianjin Gu Jun 22 '17 at 23:23
  • The variable, which is fed from a placeholder: `self.X = tf.Variable(tf.constant(0.0, shape=[3000000, 300]), trainable=False, name="X") self._embeddingPlaceholder = tf.placeholder(tf.float32, [3000000,300], name="googlew2v") self.set_x = self.X.assign(self._embeddingPlaceholder)` Init: `tfSession.run(tf.global_variables_initializer()) tfSession.run(model.set_x, feed_dict={model._embeddingPlaceholder: vectorSpace.wordVectors()})` – Andrew Schenck Jun 23 '17 at 15:36
  • @TianjinGu it's based on [this question](https://stackoverflow.com/questions/35394103/initializing-tensorflow-variable-with-an-array-larger-than-2gb), if that helps. I would still need to provide the placeholder during serving, which is the issue. During training it's no problem, but serving will not let me send such a large dataset to the placeholder via gRPC. – Andrew Schenck Jun 23 '17 at 15:43

2 Answers


So I think I figured out what is happening, and it's kind of silly once I sit and think about it.

The GoogleW2V vector embeddings are on the order of ~3 GB. In my serving client I was attempting to set the inputs tensor for the embeddings and then make an RPC call... there must be some safeguard that refuses to RPC that much data (d'oh). The error was not very clear about this, but once I tried a small subset of the embeddings it worked with no problem.
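For reference, gRPC's default maximum message size is only about 4 MB, far below these payloads. Newer grpcio versions let you raise the caps via channel options, although protobuf's hard 2 GB limit means multi-gigabyte tensors still cannot travel in a single message. A sketch, assuming a grpcio and tensorflow-serving-api version that ships these options and the non-beta stub (the host:port is a placeholder):

import grpc
from tensorflow_serving.apis import prediction_service_pb2_grpc

# Raise gRPC's send/receive caps (default ~4 MB); protobuf itself still
# refuses any single message over 2 GB, so this only helps mid-sized tensors.
channel = grpc.insecure_channel(
    'localhost:9002',
    options=[('grpc.max_send_message_length', 512 * 1024 * 1024),
             ('grpc.max_receive_message_length', 512 * 1024 * 1024)])
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)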

I think I can come up with a workaround for my use case. But as it stands I cannot keep the GoogleW2V embeddings as a placeholder in my model; otherwise I must supply them during serving, and the RPC layer won't let me send that much data (and even if it did, it would take forever).

I have a workaround in my project now (I just do the embedding lookups etc. before training and serving). But if it's possible, I would like the option to include the GoogleW2V embeddings in my model via a variable or something, rather than a placeholder (so that I can take advantage of tf.nn.embedding_lookup's parallelism and speed); see the sketch below.
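Following Tianjin Gu's suggestion in the comments, one way to do that is to keep the placeholder-assign trick for export time only: run the assign once before saving, so the embedding values are checkpointed into the SavedModel's variables files (which have no 2 GB protobuf limit) and the serving client never has to feed them. A sketch under those assumptions, reusing the shapes and names from the comments above:

import tensorflow as tf

# The placeholder exists only to fill the Variable once at export time,
# dodging the 2 GB limit on GraphDef constants.
X = tf.Variable(tf.constant(0.0, shape=[3000000, 300]), trainable=False, name="X")
emb_ph = tf.placeholder(tf.float32, [3000000, 300])
set_x = X.assign(emb_ph)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(set_x, feed_dict={emb_ph: vectorSpace.wordVectors()})
    # add_meta_graph_and_variables checkpoints X's current value into the
    # SavedModel, so serving restores it without any embedding input.
    builder = tf.saved_model.builder.SavedModelBuilder(export_path)
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING],
        signature_def_map={'softmaxPredictions': prediction_signature})
    builder.save()

The prediction signature then only needs the 'inputs' and 'dropout' placeholders.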

Andrew Schenck

You can use assets in your builder when you export the pb model, and you can use table = lookup.index_table_from_file(vocab_path) when you define your model, with input_x = tf.placeholder(tf.string, ...).
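
A sketch of that approach, assuming TF 1.x's tf.contrib.lookup and the builder's asset support (vocab_path, export_path, and prediction_signature stand in for the asker's own values):

import tensorflow as tf
from tensorflow.contrib import lookup

# The client sends raw strings; the vocab file travels inside the export
# as a SavedModel asset, so no huge lookup data crosses the wire.
input_x = tf.placeholder(tf.string, [None], name='input_x')
table = lookup.index_table_from_file(vocabulary_file=vocab_path, num_oov_buckets=1)
ids = table.lookup(input_x)

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    builder = tf.saved_model.builder.SavedModelBuilder(export_path)
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING],
        signature_def_map={'softmaxPredictions': prediction_signature},
        # register the vocab file as an asset and re-init tables on load
        assets_collection=tf.get_collection(tf.GraphKeys.ASSET_FILEPATHS),
        main_op=tf.tables_initializer())
    builder.save()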