
I would like to include my custom pre-processing logic in my exported Keras model for use in Tensorflow Serving.

My pre-processing performs string tokenization and uses an external dictionary to convert each token to an index for input to the Embedding layer:

from keras.preprocessing import sequence

token_to_idx_dict = ... #read from file

# Custom Pythonic pre-processing steps on input_data
tokens = [tokenize(s) for s in input_data]
token_idxs = [[token_to_idx_dict[t] for t in ts] for ts in tokens]
tokens_padded = sequence.pad_sequences(token_idxs, maxlen=maxlen)

Model architecture and training:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
model.add(LSTM(128, activation='sigmoid'))
model.add(Dense(n_classes, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')

model.fit(x_train, y_train)

Since the model will be used in Tensorflow Serving, I want to incorporate all pre-processing logic into the model itself (encoded in the exported model file).

Q: How can I do so using the Keras library only?

I found this guide, which explains how to combine Keras and Tensorflow. But I'm still unsure how to export everything as one model.

I know Tensorflow has built-in string splitting, file I/O, and dictionary lookup operations.

Pre-processing logic using Tensorflow operations:

# Get input text
input_string_tensor = tf.placeholder(tf.string, shape=(1,))
# Split input text by whitespace
splitted_string = tf.string_split(input_string_tensor, " ")
# Read index lookup dictionary
token_to_idx_dict = tf.contrib.lookup.HashTable(
    tf.contrib.lookup.TextFileInitializer("vocab.txt", tf.string, 0, tf.int64, 1, delimiter=","), -1)
# Convert tokens to indexes
token_idxs = token_to_idx_dict.lookup(splitted_string)
# Pad zeros to fixed length
token_idxs_padded = tf.pad(token_idxs, ...)

Q: How can I use these Tensorflow pre-defined pre-processing operations and my Keras layers together to both train and then export the model as a "black box" for use in Tensorflow Serving?

Qululu

2 Answers


I figured it out, so I'm going to answer my own question here.

Here's the gist:

First, in a separate code file, I trained the model using Keras only, with my own pre-processing functions, and exported the Keras model's weights file and my token-to-index dictionary.

Then, I copied just the Keras model architecture, set its input to the output of the pre-processing tensors, loaded the weights file from the previously trained Keras model, and sandwiched it between the Tensorflow pre-processing operations and the Tensorflow exporter.
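
For context, the separate training script (step 1) might look roughly like this. This is a sketch, not my exact code: `tokenize` and `token_to_idx_dict` are the same helpers as in the question, while the raw training data variable and the output file names are placeholders of mine. Note that the dictionary is written tab-delimited, matching the `delimiter='\t'` used in the export script below.

from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

# Pythonic pre-processing, exactly as in the question
tokens = [tokenize(s) for s in raw_train_texts]
token_idxs = [[token_to_idx_dict[t] for t in ts] for ts in tokens]
x_train = sequence.pad_sequences(token_idxs, maxlen=maxlen)

# Same architecture that gets re-declared in the export script below
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
model.add(LSTM(128, activation='sigmoid'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.fit(x_train, y_train)

# Export only the weights; the export script rebuilds the architecture itself
model.save_weights('model_weights.h5')

# Dump the token-to-index dictionary in the tab-delimited two-column format
# expected by TextFileInitializer(..., delimiter='\t') in the export script
with open('token_to_idx.tsv', 'w') as f:
    for token, idx in token_to_idx_dict.items():
        f.write('{}\t{}\n'.format(token, idx))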

Final product:

import tensorflow as tf
from keras import backend as K
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense
from tensorflow.contrib.session_bundle import exporter
from tensorflow.contrib.lookup import HashTable, TextFileInitializer

# Initialize Keras with Tensorflow session
sess = tf.Session()
K.set_session(sess)

# Token to index lookup dictionary
token_to_idx_path = '...'
token_to_idx_dict = HashTable(
    TextFileInitializer(token_to_idx_path, tf.string, 0, tf.int64, 1, delimiter='\t'), 0)

maxlen = ...
max_features = ...   # vocabulary size, same value used during training
num_classes = ...

# Pre-processing sub-graph using Tensorflow operations
input = tf.placeholder(tf.string, name='input')
sparse_tokenized_input = tf.string_split(input)
tokenized_input = tf.sparse_tensor_to_dense(sparse_tokenized_input, default_value='')
token_idxs = token_to_idx_dict.lookup(tokenized_input)
# Pad each row with maxlen trailing zeros, then slice to exactly maxlen columns,
# so sequences of any length end up with a fixed length of maxlen
token_idxs_padded = tf.pad(token_idxs, [[0,0],[0,maxlen]])
token_idxs_embedding = tf.slice(token_idxs_padded, [0,0], [-1,maxlen])

# Initialize Keras model
model = Sequential()
e = Embedding(max_features, 128, input_length=maxlen)
e.set_input(token_idxs_embedding)
model.add(e)
model.add(LSTM(128, activation='sigmoid'))
model.add(Dense(num_classes, activation='softmax'))

# Load weights from previously trained Keras model
weights_path = '...'
model.load_weights(weights_path)

K.set_learning_phase(0)

# Export model in Tensorflow format
# (Official tutorial: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/g3doc/serving_basic.md)
saver = tf.train.Saver(sharded=True)
model_exporter = exporter.Exporter(saver)
signature = exporter.classification_signature(input_tensor=model.input, scores_tensor=model.output)
model_exporter.init(sess.graph.as_graph_def(), default_graph_signature=signature)
model_dir = '...'
model_version = 1
model_exporter.export(model_dir, tf.constant(model_version), sess)

# Input example
with sess.as_default():
    token_to_idx_dict.init.run()
    sess.run(model.output, feed_dict={input: ["this is a raw input example"]})
Qululu
  • FYI, the Layer method `set_input()` only works for Keras version 1.1.1. After that, it was removed. I can't figure out how to set the input of a layer to a Tensorflow tensor in later versions. If anyone does, please comment. – Qululu Feb 24 '17 at 07:59
  • Hi @Qululu, in Keras 2.0+ you can now call a Keras model/layer directly on a Tensorflow tensor/placeholder (just like you would with Keras layers/tensors). For example, see this official page: https://blog.keras.io/keras-as-a-simplified-interface-to-tensorflow-tutorial.html Hope this helps! ;) – user1414202 Aug 18 '17 at 18:42
  • How do you train the model with tf preprocessing? How do you call .fit() and how is the tf placeholder fed? – Daniel Nitzan Dec 13 '17 at 02:11
  • Looks like it is not supported for now https://github.com/keras-team/keras/issues/7503. For calling .fit(), I ended up knocking out the InputLayer from the model, and called fit() like this: `model.fit(token_idxs_embedding.eval(session=sess, feed_dict={x_input: X_train}), y_train...`. This materializes the TF placeholder. For inference, I put the InputLayer back in the model and saved it. – Daniel Nitzan Dec 13 '17 at 20:40
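
Following up on the comments above: in Keras 2.x the removed `set_input()` can apparently be replaced by calling the Keras model directly on a Tensorflow tensor, as shown in the blog post linked in the comments. A rough, untested sketch of how the middle of the export script would change under that assumption (it reuses `input`, `token_idxs_embedding`, `weights_path`, and the exporter from the script above):

# Rebuild the trained architecture and load the weights as before
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
model.add(LSTM(128, activation='sigmoid'))
model.add(Dense(num_classes, activation='softmax'))
model.load_weights(weights_path)

# Keras 2.x: call the model on the Tensorflow pre-processing output directly,
# instead of Embedding.set_input()
output_tensor = model(token_idxs_embedding)

# Build the serving signature from the raw string placeholder and the new output
signature = exporter.classification_signature(input_tensor=input, scores_tensor=output_tensor)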

The accepted answer is super helpful; however, it uses an outdated Keras API (as @Qululu mentioned) and an outdated TF Serving API (Exporter), and it does not show how to export the model so that its input is the original TF placeholder (rather than Keras model.input, which is post pre-processing). The following version works well as of TF v1.4 and Keras 2.1.2:

import tensorflow as tf
from keras import backend as K
from keras.models import Sequential
from keras.layers import InputLayer
from tensorflow.contrib.lookup import TextFileIndex
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants, tag_constants

sess = tf.Session()
K.set_session(sess)

K._LEARNING_PHASE = tf.constant(0)
K.set_learning_phase(0)

max_features = 5000
max_lens = 500

dict_table = tf.contrib.lookup.HashTable(
    tf.contrib.lookup.TextFileInitializer("vocab.txt", tf.string, 0, tf.int64,
                                          TextFileIndex.LINE_NUMBER,
                                          vocab_size=max_features, delimiter=" "),
    0)

x_input = tf.placeholder(tf.string, name='x_input', shape=(None,))
sparse_tokenized_input = tf.string_split(x_input)
tokenized_input = tf.sparse_tensor_to_dense(sparse_tokenized_input, default_value='')
token_idxs = dict_table.lookup(tokenized_input)
token_idxs_padded = tf.pad(token_idxs, [[0,0],[0, max_lens]])
token_idxs_embedding = tf.slice(token_idxs_padded, [0,0], [-1, max_lens])

model = Sequential()
model.add(InputLayer(input_tensor=token_idxs_embedding, input_shape=(None, max_lens)))

 ...REST OF MODEL...

model.load_weights("model.h5")

x_info = tf.saved_model.utils.build_tensor_info(x_input)
y_info = tf.saved_model.utils.build_tensor_info(model.output)

prediction_signature = tf.saved_model.signature_def_utils.build_signature_def(
    inputs={"text": x_info},
    outputs={"prediction": y_info},
    method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)

builder = saved_model_builder.SavedModelBuilder("/path/to/model")

legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')

init_op = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
sess.run(init_op)


# Add the meta_graph and the variables to the builder
builder.add_meta_graph_and_variables(
  sess, [tag_constants.SERVING],
  signature_def_map={
       signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
           prediction_signature,
  },
  legacy_init_op=legacy_init_op)

builder.save()  
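
Once this SavedModel is loaded by Tensorflow Serving, the client sends raw strings and gets scores back. A rough sketch of a gRPC client (the host, port, and model name are placeholders of mine; the "text" and "prediction" keys match the signature built above):

import tensorflow as tf
from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2

channel = implementations.insecure_channel('localhost', 9000)   # placeholder host/port
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = 'my_model'   # placeholder model name
request.model_spec.signature_name = \
    tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY

# Raw, un-preprocessed strings go straight into the "text" input of the signature
request.inputs['text'].CopyFrom(
    tf.contrib.util.make_tensor_proto(['this is a raw input example'], shape=[1]))

result = stub.Predict(request, 10.0)   # 10-second timeout
scores = tf.contrib.util.make_ndarray(result.outputs['prediction'])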

UPDATE: Doing the pre-processing for inference with Tensorflow is a CPU operation, and it is not carried out efficiently when the model is deployed on a GPU server: the GPU stalls badly and throughput is very low. We therefore ditched this approach in favor of efficient pre-processing in the client process instead.
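
For reference, the client-side pre-processing we switched to is plain Python/NumPy along these lines (a sketch with assumed helper names; in this setup the served model takes the padded index matrix as its input rather than raw strings):

import numpy as np

def preprocess(texts, token_to_idx, maxlen):
    # Tokenize, map tokens to indices, and zero-pad/truncate each row to maxlen
    batch = np.zeros((len(texts), maxlen), dtype=np.int64)
    for i, text in enumerate(texts):
        idxs = [token_to_idx.get(t, 0) for t in text.split()][:maxlen]
        batch[i, :len(idxs)] = idxs
    return batch

# The resulting index matrix is what gets sent to TF Serving, against a model
# exported without the in-graph pre-processing sub-graph.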

Daniel Nitzan
  • How did you call model.fit() here? Was the input just a list of strings? Where do I put my raw strings to be processed? – chattrat423 Jan 09 '19 at 18:47
  • @chattrat423 This is a script to export the model for inference, so that Tensorflow can do the pre-processing instead of the client. For training, you can use a Keras only version, without the Tensorflow pre-processing code (you can do pre-processing in any python library instead). – Daniel Nitzan Jan 11 '19 at 17:54
  • Thank you, I really appreciate that clarity. Do you happen to have an example for post processing as well where we take predicted label indices and convert them to their string label forms? I am trying to create this for tensorflow serving with my Keras model – chattrat423 Jan 13 '19 at 18:06
  • @chattrat423 I'm not sure I understand your question. When you export your model as a SavedModel, you're supposed to define the output signature using `tf.saved_model.signature_def_utils.build_signature_def` as shown in the code. This is model-specific and should fit the output layer of your model. The client should extract the prediction(s) from the TF Serving call and parse them. If you reached a point where your SavedModel is successfully saved, then write a small python script to call it and print the response, and it will show you the data structure that you need to parse. – Daniel Nitzan Jan 14 '19 at 03:33
  • My current TF model takes the indices of the top-k predictions and converts those indices to text labels: `label_lookup = tf.contrib.lookup.index_to_string_table_from_tensor(mapping=tf.constant(TARGETS))`; `label_names = label_lookup.lookup(tf.cast(indices, tf.int64))` (the indices need to be cast to tf.int64). How do I do that in your example and save it in the graph for TF serving? – chattrat423 Jan 16 '19 at 00:30
  • @chattrat423 I haven't done anything similar to this (i.e. "post-processing" with Tensorflow). As mentioned, I figured out that it's best to do all pre/post processing in the client process, rather than in Tensorflow. – Daniel Nitzan Jan 17 '19 at 16:05