Typically, to use a TF graph it is necessary to convert raw data to numerical values; I refer to this as a pre-processing step. For example, if the raw data is a sentence, one way to do this is to tokenize the sentence and map each word to a unique number. This pre-processing creates a sequence of numbers for each sentence, which becomes the input of the model.
We also need to post-process the output of the model to interpret it, for example by converting the sequence of numbers generated by the model back to words and then building a sentence.
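To make the question concrete, here is a minimal sketch of the pre- and post-processing I mean, in plain Python outside any TF graph; the toy vocabulary and function names are my own illustration, not part of any TensorFlow API:

```python
# Toy vocabulary for illustration only; a real model would use a much
# larger vocabulary built from its training corpus.
vocab = {"<unk>": 0, "hello": 1, "world": 2}
inv_vocab = {i: w for w, i in vocab.items()}

def preprocess(sentence):
    """Tokenize on whitespace and map each word to its vocabulary id."""
    return [vocab.get(word, vocab["<unk>"]) for word in sentence.lower().split()]

def postprocess(ids):
    """Map a sequence of ids produced by the model back to words."""
    return " ".join(inv_vocab.get(i, "<unk>") for i in ids)

print(preprocess("Hello world"))  # -> [1, 2]
print(postprocess([1, 2]))        # -> hello world
```

The question is where this kind of logic should live when the model itself is served remotely.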
TF Serving is a technology recently introduced by Google to serve TF models. My question is:
Where should pre-processing and post-processing be executed when a TF model is served using TensorFlow serving?
Should I encapsulate the pre-processing and post-processing steps in my TF graph (e.g. using tf.py_func or tf.map_fn), or is there another TensorFlow technology that I am not aware of?