
I have a trained TensorFlow model that uses two inputs to make predictions. I have successfully set up and deployed the model on AWS SageMaker.

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data='s3://' + sagemaker_session.default_bucket()
                                             + '/R2-model/R2-model.tar.gz',
                                  role=role,
                                  framework_version='1.12',
                                  py_version='py2',
                                  entry_point='train.py')

predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type='ml.m4.xlarge')

predictor.predict([data_scaled_1.to_csv(),
                   data_scaled_2.to_csv()])

I always receive an error when calling predict this way. I could use an AWS Lambda function instead, but I can't find any documentation on passing multiple inputs to a deployed model. Does anyone know how to do this?

JHall651
3 Answers


You first need to build a correct serving signature when exporting the model, and you need to deploy it with TensorFlow Serving.

At inference time, you also need to send the request in the proper input format: the SageMaker serving container essentially passes the request body on to TensorFlow Serving, so the input has to match what TF Serving expects.

Here is a simple example of deploying a Keras multi-input, multi-output model to TensorFlow Serving using SageMaker, and how to make inferences afterwards:

import tarfile

from tensorflow.python.saved_model import builder
from tensorflow.python.saved_model.signature_def_utils import predict_signature_def
from tensorflow.python.saved_model import tag_constants
from keras import backend as K
import sagemaker
from sagemaker import get_execution_role
from sagemaker.tensorflow.serving import Model


def serialize_to_tf_and_dump(model, export_path):
    """
    serialize a Keras model to TF model
    :param model: compiled Keras model
    :param export_path: str, The export path contains the name and the version of the model
    :return:
    """
    # Build the Protocol Buffer SavedModel at 'export_path'
    save_model_builder = builder.SavedModelBuilder(export_path)
    # Create prediction signature to be used by TensorFlow Serving Predict API
    signature = predict_signature_def(
        inputs={
            "input_type_1": model.input[0],
            "input_type_2": model.input[1],
        },
        outputs={
            "decision_output_1": model.output[0],
            "decision_output_2": model.output[1],
            "decision_output_3": model.output[2]
        }
    )
    with K.get_session() as sess:
        # Save the meta graph and variables
        save_model_builder.add_meta_graph_and_variables(
            sess=sess, tags=[tag_constants.SERVING], signature_def_map={"serving_default": signature})
        save_model_builder.save()

# instantiate the compiled Keras model (two inputs, three outputs)
model = ....

# convert to tf model
serialize_to_tf_and_dump(model, 'model_folder/1')
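
# Optional sanity check (a suggestion, not part of the original answer):
# reload the exported SavedModel in a fresh session and print the
# 'serving_default' signature to confirm the input/output names match
# what will be sent in the inference request later on.
import tensorflow as tf
with tf.Session(graph=tf.Graph()) as check_sess:
    meta_graph = tf.saved_model.loader.load(
        check_sess, [tag_constants.SERVING], 'model_folder/1')
    print(meta_graph.signature_def['serving_default'])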

# tar tf model
with tarfile.open('model.tar.gz', mode='w:gz') as archive:
    archive.add('model_folder', recursive=True)

# upload it to s3
sagemaker_session = sagemaker.Session()
inputs = sagemaker_session.upload_data(path='model.tar.gz')

# convert to sagemaker model
role = get_execution_role()
sagemaker_model = Model(model_data = inputs,
    name='DummyModel',
    role = role,
    framework_version = '1.12')

predictor = sagemaker_model.deploy(initial_instance_count=1,
    instance_type='ml.t2.medium', endpoint_name='MultiInputMultiOutputModel')

At inference time, here is how to request predictions:

import json
import boto3

x_inputs = ...  # list with 2 np arrays of size (batch_size, ...)
data = {
    'inputs': {
        "input_type_1": x_inputs[0].tolist(),
        "input_type_2": x_inputs[1].tolist()
    }
}

endpoint_name = 'MultiInputMultiOutputModel'
client = boto3.client('runtime.sagemaker')
response = client.invoke_endpoint(EndpointName=endpoint_name, Body=json.dumps(data), ContentType='application/json')
predictions = json.loads(response['Body'].read())
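
TF Serving's REST API also accepts a row-oriented 'instances' payload, where each list element is one example carrying both named inputs. This variant is not from the original answer, just a sketch of the alternative format, reusing the x_inputs arrays from above:

# row-oriented alternative: one dict per example, keyed by input name
data_rows = {
    'instances': [
        {"input_type_1": example_1, "input_type_2": example_2}
        for example_1, example_2 in zip(x_inputs[0].tolist(), x_inputs[1].tolist())
    ]
}
response = client.invoke_endpoint(EndpointName=endpoint_name,
                                  Body=json.dumps(data_rows),
                                  ContentType='application/json')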
  • This is the correct solution to my problem. I implemented a Question Answering model using a pretrained BERT model and could not find a way to pass a list of 2 inputs (for 1 example there are 2 inputs, a question and an answer; the label is 0/1) to the endpoint. Creating the signature when saving the model as instructed above and passing the JSON data format when invoking the endpoint did the job. – AmyN May 09 '22 at 08:02

You likely need to customize the inference functions loaded in the endpoint. In the SageMaker TensorFlow SDK doc you can find that there are two options for SageMaker TensorFlow deployment: the Python-based endpoint driven by an entry-point script, and the TensorFlow Serving endpoint.

You can diagnose the error in CloudWatch (accessible through the SageMaker endpoint UI), choose the more appropriate of the two serving architectures, and customize the inference functions if need be.
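
For example, the TensorFlow Serving endpoint can (in recent versions of the container) be customized through an inference.py that defines input_handler and output_handler functions. The sketch below is not from this answer; it assumes a text/csv request and a made-up column split (the first two columns go to input_type_1, the rest to input_type_2), and reshapes the rows into the JSON that TF Serving expects:

# inference.py (sketch only; adapt the column split to your data)
import json

def input_handler(data, context):
    """Turn a text/csv request into the JSON payload TF Serving expects."""
    if context.request_content_type == 'text/csv':
        rows = [line.split(',') for line in
                data.read().decode('utf-8').strip().splitlines()]
        # hypothetical split: first 2 columns -> input_type_1, rest -> input_type_2
        payload = {
            'inputs': {
                'input_type_1': [[float(v) for v in row[:2]] for row in rows],
                'input_type_2': [[float(v) for v in row[2:]] for row in rows],
            }
        }
        return json.dumps(payload)
    # pass JSON requests straight through to TF Serving
    return data.read().decode('utf-8')

def output_handler(response, context):
    """Return the TF Serving response to the client unchanged."""
    if response.status_code != 200:
        raise ValueError(response.content.decode('utf-8'))
    return response.content, context.accept_header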

Olivier Cruchant
  • I got as far as modifying the input_fn to split a serialized csv into two inputs, but nowhere in the docs does it say how to send multiple inputs to the model. – JHall651 Jun 06 '19 at 23:18
  • That's because the doc is model agnostic. How would you do this outside of SageMaker? What would your inference call look like? – Olivier Cruchant Jun 07 '19 at 07:50

Only the TensorFlow Serving endpoint supports multiple inputs in one inference request. You can follow the documentation here to deploy a TFS endpoint: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst
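
As a minimal sketch (assuming a model exported and deployed as in the accepted answer, with inputs named input_type_1 and input_type_2, and x_inputs holding the two arrays), the SDK's TFS predictor can also send both inputs in one call instead of going through boto3:

from sagemaker.tensorflow.serving import Predictor

# attach to the already-deployed TFS endpoint by name
predictor = Predictor('MultiInputMultiOutputModel')

# the dict is JSON-serialized and forwarded to TF Serving as-is
result = predictor.predict({
    'inputs': {
        'input_type_1': x_inputs[0].tolist(),
        'input_type_2': x_inputs[1].tolist(),
    }
})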

Rui