There are different kinds of solutions:
Option A: You can use TensorFlow Serving to serve your trained and saved model. You then preprocess and prepare the data for a request and use the tensorflow-serving-api package to send a PredictRequest via gRPC to the previously started service:
import grpc
import tensorflow as tf
from typing import List

from tensorflow_serving.apis.predict_pb2 import (
    PredictRequest,
    PredictResponse,
)
from tensorflow_serving.apis.prediction_service_pb2_grpc import (
    PredictionServiceStub,
)

values: List[float] = ...  # YOUR PREPROCESSED DATA ...

# Open a gRPC channel to the running TensorFlow Serving instance
# (8500 is its default gRPC port).
host, port = '0.0.0.0', 8500
channel = grpc.insecure_channel(f'{host}:{port}')
stub = PredictionServiceStub(channel)

# The model name and signature name must match what the server exposes.
request = PredictRequest()
request.model_spec.name = 'cifar10'
request.model_spec.signature_name = 'serving_default'

# Pack the preprocessed values into a TensorProto shaped as a
# batch of one 32x32 RGB image.
tensor_proto = tf.make_tensor_proto(
    values=values,
    shape=(1, 32, 32, 3),
    dtype=tf.float32,
)
request.inputs['input'].CopyFrom(tensor_proto)

# Send the request and read the output tensor back as a flat list of floats.
response: PredictResponse = stub.Predict(request)
result = response.outputs['output'].float_val
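Note that the model name, the signature name, and the 'input'/'output' keys in the request have to match the exported SavedModel. A minimal export sketch with the TF 2.x Keras API (the toy architecture, the layer names 'input' and 'output', and the cifar10/1 export path are assumptions chosen to line up with the request above):

import tensorflow as tf

# Toy stand-in for your trained model; with Keras, the input and output
# layer names determine the tensor keys of the generated
# 'serving_default' signature.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3), name='input'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax', name='output'),
])

# TensorFlow Serving watches numbered version subdirectories, hence '/1'.
model.save('cifar10/1')

You can then start TensorFlow Serving (e.g. via the tensorflow/serving Docker image) with the model name cifar10 and the model base path pointing at the cifar10 directory.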
Option B: You could try to use Lambda layers to embed the preprocessing steps into the model itself. I've never tried it out, but you should be careful if you use custom imports and dependencies, because if you load a saved model with the regular load_model method, you have to supply the required custom_objects, as in the sketch below.
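A minimal sketch of the idea, again with the TF 2.x Keras API (the preprocess function, the layer names, and the save path are hypothetical, and the rest of the architecture is elided):

import tensorflow as tf

# Hypothetical preprocessing step: scale raw pixel values to [0, 1].
def preprocess(x):
    return tf.cast(x, tf.float32) / 255.0

# Embed the preprocessing as the first layer, so clients can send raw data.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Lambda(preprocess, name='preprocess'),
    # ... the rest of your architecture ...
])

model.save('cifar10_with_preprocessing')

# On load, the function wrapped by the Lambda layer is a custom object
# and must be supplied explicitly:
loaded = tf.keras.models.load_model(
    'cifar10_with_preprocessing',
    custom_objects={'preprocess': preprocess},
)

The upside of this approach is that clients no longer need to duplicate the preprocessing logic; the downside is exactly the custom_objects coupling shown above, since every consumer that loads the model needs access to the same function.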