I am iterating on a model inference deployment using script mode in SageMaker (currently running in local mode), and I update my inference script often. Every time I change the inference.py
entry point script, I need to recreate the model instance like this:
from sagemaker.pytorch import PyTorchModel

model_instance = PyTorchModel(
    model_data=model_tar_path,
    role=role,
    source_dir="code",
    entry_point="inference.py",
    framework_version="1.8",
    py_version="py3",
)
and then call
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

predictor = model_instance.deploy(
    initial_instance_count=1,
    endpoint_name="some_name",
    instance_type=instance_type,
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
)
over and over every time I change something. This takes a very long time on each iteration because it essentially starts a new Docker container (remember, I'm running locally) and waits for all dependencies to install before I can do anything with the endpoint. And if there is any error, I have to go through the whole process again.
I'd like to explore whether I can use the update_endpoint
functionality to redeploy the endpoint within the same container, without having to recreate a new container every time and then wait for all the dependency installations, etc.
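Roughly, this is what I'm hoping to be able to do (just a sketch of my intent, using the names from the code above; I'm not sure update_endpoint is even supported in local mode, and I'm not sure model_instance.name is the right thing to pass):

from sagemaker.pytorch import PyTorchPredictor
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

# Re-attach to the endpoint that is already running locally instead of
# recreating the PyTorchModel and calling deploy() from scratch.
predictor = PyTorchPredictor(
    endpoint_name="some_name",
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
)

# What I'd like: push the updated inference.py to the existing endpoint
# rather than spinning up a brand-new container.
# model_instance.name is my assumption for how to reference a model that
# already picked up the new code.
predictor.update_endpoint(
    initial_instance_count=1,
    instance_type=instance_type,
    model_name=model_instance.name,
)

Is something like this possible, or is there another way to avoid the full container restart on every code change?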