
I have a trained TensorFlow model that uses two inputs to make predictions. I have successfully set up and deployed the model on AWS SageMaker.

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data='s3://' + sagemaker_session.default_bucket()
                                             + '/R2-model/R2-model.tar.gz',
                                  role=role,
                                  framework_version='1.12',
                                  py_version='py2',
                                  entry_point='train.py')

predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type='ml.m4.xlarge')

predictor.predict([data_scaled_1.to_csv(),
                   data_scaled_2.to_csv()])

I always receive an error when calling predict this way. I could use an AWS Lambda function instead, but I can't find any documentation on passing multiple inputs to a deployed model. Does anyone know how to do this?

JHall651
3 Answers


You first need to build a correct serving signature when exporting the model, and you need to deploy it with TensorFlow Serving.

At inference time, you also need to send the request in the proper input format: the SageMaker serving container essentially passes the request body on to TensorFlow Serving, so the input has to match what TF Serving expects.

Here is a simple example of deploying a Keras multi-input, multi-output model to TensorFlow Serving using SageMaker, and how to make inferences afterwards:

import tarfile

from tensorflow.python.saved_model import builder
from tensorflow.python.saved_model.signature_def_utils import predict_signature_def
from tensorflow.python.saved_model import tag_constants
from keras import backend as K
import sagemaker
from sagemaker import get_execution_role
from sagemaker.tensorflow.serving import Model


def serialize_to_tf_and_dump(model, export_path):
    """
    serialize a Keras model to TF model
    :param model: compiled Keras model
    :param export_path: str, The export path contains the name and the version of the model
    :return:
    """
    # Build the Protocol Buffer SavedModel at 'export_path'
    save_model_builder = builder.SavedModelBuilder(export_path)
    # Create prediction signature to be used by TensorFlow Serving Predict API
    signature = predict_signature_def(
        inputs={
            "input_type_1": model.input[0],
            "input_type_2": model.input[1],
        },
        outputs={
            "decision_output_1": model.output[0],
            "decision_output_2": model.output[1],
            "decision_output_3": model.output[2]
        }
    )
    with K.get_session() as sess:
        # Save the meta graph and variables
        save_model_builder.add_meta_graph_and_variables(
            sess=sess, tags=[tag_constants.SERVING], signature_def_map={"serving_default": signature})
        save_model_builder.save()

# instantiate the compiled Keras model (two inputs, three outputs)
model = ....

# convert to tf model
serialize_to_tf_and_dump(model, 'model_folder/1')
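
# Optional sanity check (a suggestion, not part of the original answer):
# reload the exported SavedModel in a fresh session and print the
# 'serving_default' signature to confirm the input/output names match
# what will be sent in the inference request later on.
import tensorflow as tf
with tf.Session(graph=tf.Graph()) as check_sess:
    meta_graph = tf.saved_model.loader.load(
        check_sess, [tag_constants.SERVING], 'model_folder/1')
    print(meta_graph.signature_def['serving_default'])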

# tar tf model
with tarfile.open('model.tar.gz', mode='w:gz') as archive:
    archive.add('model_folder', recursive=True)

# upload it to s3
sagemaker_session = sagemaker.Session()
inputs = sagemaker_session.upload_data(path='model.tar.gz')

# convert to sagemaker model
role = get_execution_role()
sagemaker_model = Model(model_data = inputs,
    name='DummyModel',
    role = role,
    framework_version = '1.12')

predictor = sagemaker_model.deploy(initial_instance_count=1,
    instance_type='ml.t2.medium', endpoint_name='MultiInputMultiOutputModel')

At inference time, here is how to request predictions:

import json
import boto3

x_inputs = ...  # list with 2 np arrays of size (batch_size, ...)
data = {
    'inputs': {
        "input_type_1": x_inputs[0].tolist(),
        "input_type_2": x_inputs[1].tolist()
    }
}

endpoint_name = 'MultiInputMultiOutputModel'
client = boto3.client('runtime.sagemaker')
response = client.invoke_endpoint(EndpointName=endpoint_name, Body=json.dumps(data), ContentType='application/json')
predictions = json.loads(response['Body'].read())
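
TF Serving's REST API also accepts a row-oriented 'instances' payload, where each list element is one example carrying both named inputs. This variant is not from the original answer, just a sketch of the alternative format, reusing the x_inputs arrays from above:

# row-oriented alternative: one dict per example, keyed by input name
data_rows = {
    'instances': [
        {"input_type_1": example_1, "input_type_2": example_2}
        for example_1, example_2 in zip(x_inputs[0].tolist(), x_inputs[1].tolist())
    ]
}
response = client.invoke_endpoint(EndpointName=endpoint_name,
                                  Body=json.dumps(data_rows),
                                  ContentType='application/json')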
  • This is the correct solution to my problem. I implemented a Question Answering model using a pretrained BERT model and could not find a way to pass a list of 2 inputs (for 1 example there are 2 inputs, a question and an answer; the label is 0/1) to the endpoint. Creating the signature when saving the model as instructed above and passing the JSON data format when invoking the endpoint did the job. – AmyN May 09 '22 at 08:02

You likely need to customize the inference functions loaded in the endpoint. In the SageMaker TensorFlow SDK doc you can find that there are two options for SageMaker TensorFlow deployment: the Python-based endpoint driven by an entry-point script, and the TensorFlow Serving endpoint.

You can diagnose the error in CloudWatch (accessible through the SageMaker endpoint UI), choose the more appropriate of the two serving architectures, and customize the inference functions if need be.
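
For example, the TensorFlow Serving endpoint can (in recent versions of the container) be customized through an inference.py that defines input_handler and output_handler functions. The sketch below is not from this answer; it assumes a text/csv request and a made-up column split (the first two columns go to input_type_1, the rest to input_type_2), and reshapes the rows into the JSON that TF Serving expects:

# inference.py (sketch only; adapt the column split to your data)
import json

def input_handler(data, context):
    """Turn a text/csv request into the JSON payload TF Serving expects."""
    if context.request_content_type == 'text/csv':
        rows = [line.split(',') for line in
                data.read().decode('utf-8').strip().splitlines()]
        # hypothetical split: first 2 columns -> input_type_1, rest -> input_type_2
        payload = {
            'inputs': {
                'input_type_1': [[float(v) for v in row[:2]] for row in rows],
                'input_type_2': [[float(v) for v in row[2:]] for row in rows],
            }
        }
        return json.dumps(payload)
    # pass JSON requests straight through to TF Serving
    return data.read().decode('utf-8')

def output_handler(response, context):
    """Return the TF Serving response to the client unchanged."""
    if response.status_code != 200:
        raise ValueError(response.content.decode('utf-8'))
    return response.content, context.accept_header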

Olivier Cruchant
  • I got as far as modifying the input_fn to split a serialized csv into two inputs, but nowhere in the docs does it say how to send multiple inputs to the model. – JHall651 Jun 06 '19 at 23:18
  • That's because the doc is model agnostic. How would you do this outside of SageMaker? What would your inference call look like? – Olivier Cruchant Jun 07 '19 at 07:50

Only the TensorFlow Serving endpoint supports multiple inputs in one inference request. You can follow the documentation here to deploy a TFS endpoint: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst
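
As a minimal sketch (assuming a model exported and deployed as in the accepted answer, with inputs named input_type_1 and input_type_2, and x_inputs holding the two arrays), the SDK's TFS predictor can also send both inputs in one call instead of going through boto3:

from sagemaker.tensorflow.serving import Predictor

# attach to the already-deployed TFS endpoint by name
predictor = Predictor('MultiInputMultiOutputModel')

# the dict is JSON-serialized and forwarded to TF Serving as-is
result = predictor.predict({
    'inputs': {
        'input_type_1': x_inputs[0].tolist(),
        'input_type_2': x_inputs[1].tolist(),
    }
})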

Rui