
I am aware that Sagemaker does not support multi-model endpoints for their built-in image classification algorithm. However, in the documentation they hint at building a custom container to use "any other framework or algorithm" with the multi-model endpoint functionality:

To use any other framework or algorithm, use the SageMaker inference toolkit to build a container that supports multi-model endpoints. For information, see Build Your Own Container with Multi Model Server.

Ideally, I would like to deploy many (20+) image classification models I have already trained to a single endpoint to save on costs. However, after reading the "Build Your Own Container" guide it is still not exactly clear to me how to build a custom inference container for the models produced by a non-custom algorithm. Most of the tutorials and example notebooks refer to using Pytorch or Sklearn. It is not clear to me that I could make inferences using these libraries on the models I've created with the built-in image classification algorithm.

Is it possible to create a container to support multi-model endpoints for unsupported built-in Sagemaker algorithms? If so, would somebody be able to hint at how this might be done?

Misha

1 Answer


Yes, it is possible to deploy the built-in image classification models behind a SageMaker multi-model endpoint. The key is that the built-in image classification algorithm uses Apache MXNet. You can extract the model artifacts (SageMaker stores them in S3 as a gzipped tar archive named model.tar.gz), then load them into MXNet. The SageMaker MXNet container supports multi-model endpoints, so you can use it to deploy the models.

If you extract the model.tar.gz produced by this algorithm, you'll find three files:

image-classification-****.params

image-classification-symbol.json

model-shapes.json

The MXNet container expects these files to be named image-classification-0000.params, model-symbol.json, and model-shapes.json, so I extracted the archive, renamed the files, and repackaged it. For more information on the MXNet container, check out the GitHub repository.
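A rough sketch of that repackaging step in Python (the local paths and output file name are just placeholders; the epoch suffix on the .params file depends on your training run, so the sketch keeps the highest-epoch checkpoint):

import glob
import os
import shutil
import tarfile

src_archive = 'model.tar.gz'   # downloaded from the training job's S3 output (placeholder path)
work_dir = 'extracted'
os.makedirs(work_dir, exist_ok=True)

with tarfile.open(src_archive) as tar:
    tar.extractall(work_dir)

# Keep the checkpoint from the highest epoch and rename the files to what the container expects
params = sorted(glob.glob(os.path.join(work_dir, 'image-classification-*.params')))[-1]
shutil.move(params, os.path.join(work_dir, 'image-classification-0000.params'))
shutil.move(os.path.join(work_dir, 'image-classification-symbol.json'),
            os.path.join(work_dir, 'model-symbol.json'))

# Repackage only the three expected files, then upload the result to S3 yourself
with tarfile.open('model-repackaged.tar.gz', 'w:gz') as tar:
    for name in ('image-classification-0000.params', 'model-symbol.json', 'model-shapes.json'):
        tar.add(os.path.join(work_dir, name), arcname=name)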

After that you can deploy the model as a single MXNet endpoint using the SageMaker SDK with the following code:

from sagemaker import get_execution_role
from sagemaker.mxnet.model import MXNetModel

role = get_execution_role()

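# s3_model is the S3 URI of the repackaged model.tar.gz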
mxnet_model = MXNetModel(model_data=s3_model, role=role, 
                         entry_point='built_in_image_classifier.py', 
                         framework_version='1.4.1',
                         py_version='py3')

predictor = mxnet_model.deploy(instance_type='ml.c4.xlarge', initial_instance_count=1)

The entry point Python script can be an empty Python file for now. We will be using the default inference handling provided by the MXNet container.

The default MXNet container only accepts JSON, CSV, and NumPy arrays as valid input, so you will have to format your input into one of these three formats. The code below demonstrates how I did it with NumPy arrays:

import io

import boto3
import cv2
import numpy as np

# Read the image and reorder it to channel-first (1, C, H, W), as the model expects
np_array = cv2.imread(img_filename)  # img_filename: path to your image file
np_array = np_array.transpose((2, 0, 1))
np_array = np.expand_dims(np_array, axis=0)

# Serialize the array to the NPY format the container accepts as application/x-npy
buffer = io.BytesIO()
np.save(buffer, np_array)

sm = boto3.client('sagemaker-runtime')
response = sm.invoke_endpoint(EndpointName='Your_Endpoint_name',
                              Body=buffer.getvalue(),
                              ContentType='application/x-npy')

Once you have a single endpoint working with the MXNet container, you should be able to get it running as a multi-model endpoint using the SageMaker MultiDataModel class.
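A minimal sketch of what that might look like with the SDK's MultiDataModel (the bucket, prefix, and per-model archive names below are placeholders; mxnet_model, sm, and buffer are the objects defined above):

from sagemaker.multidatamodel import MultiDataModel

# All models served by the endpoint live under this S3 prefix (placeholder path)
mme = MultiDataModel(name='image-classification-mme',
                     model_data_prefix='s3://your-bucket/mme-models/',
                     model=mxnet_model)

predictor = mme.deploy(initial_instance_count=1, instance_type='ml.c4.xlarge')

# Register one of the repackaged model.tar.gz archives under the prefix
mme.add_model(model_data_source='s3://your-bucket/staging/model-repackaged.tar.gz',
              model_data_path='classifier-a.tar.gz')

# Pick the model per request with TargetModel
response = sm.invoke_endpoint(EndpointName=predictor.endpoint_name,
                              TargetModel='classifier-a.tar.gz',
                              Body=buffer.getvalue(),
                              ContentType='application/x-npy')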

If you want to use a different input data type so you don't have to do the preprocessing in your application code, you can override the input_fn method of the MXNet container by providing one in the entry_point script. See here for more information. If you do this, you can pass the image bytes directly to SageMaker without formatting NumPy arrays yourself.
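As a rough sketch only: an input_fn override in the entry_point script might look something like this. The accepted content types, the 224x224 resize, and the NDArray return value are assumptions; they have to match how your model was trained and what the container's default predict_fn expects.

import io

import mxnet as mx
import numpy as np
from PIL import Image

def input_fn(input_data, content_type):
    # Convert raw image bytes into a (1, 3, H, W) NDArray for the classifier
    if content_type in ('application/x-image', 'image/jpeg', 'image/png'):
        image = Image.open(io.BytesIO(input_data)).convert('RGB')
        image = image.resize((224, 224))  # assumes the model was trained on 224x224 inputs
        array = np.asarray(image).transpose((2, 0, 1)).astype('float32')
        return mx.nd.array(np.expand_dims(array, axis=0))
    raise ValueError('Unsupported content type: {}'.format(content_type))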

Swil
  • This solution *almost* works. I am able to deploy a single-model MXNetModel and get inferences from that model. However, when I deploy a multi-model MXNetModel, (using the same exact file names, file structure, inference.py script, and configurations) I am getting a 503 when I invoke_endpoint(): An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (503) from model with message "{ "code": 503, "type": "InternalServerException", "message": "Prediction failed" } Attempting to find a solution to this... – Misha Feb 12 '21 at 18:56
  • Thanks for letting me know. I think this is a bug in the mxnet-inference-toolkit which the SageMaker MXNet container uses. See the issue I opened on [GitHub here](https://github.com/aws/sagemaker-mxnet-inference-toolkit/issues/135). As a workaround, I was able to successfully deploy the model by following the Bring Your Own Container example [here](https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/multi_model_bring_your_own). Just treat the model as a MXNet model. You will need to modify the model_handler.py file to suit the model. – Swil Feb 18 '21 at 01:30
  • Hi Swil thank you very much for the help above. Can you please explain why the model_handler for BYOM only checks epoch 0? In my setup, the default artifacts that come out of the image-classification image generates multiple params files starting with 0002, 0004, 0006 etc. up to the number of epochs. Shouldn't the model_handler pull in the highest epoch? Link to line here: https://github.com/aws/amazon-sagemaker-examples/blob/master/advanced_functionality/multi_model_bring_your_own/container/model_handler.py#L92 – Daryl Teo Sep 16 '21 at 04:36