I am iterating on a model inference deployment using script mode in SageMaker (currently running in local mode), and I update my inference script often. Every time I change the inference.py
entry point script, I need to recreate the model instance like this:
from sagemaker.pytorch import PyTorchModel

model_instance = PyTorchModel(
    model_data=model_tar_path,
    role=role,
    source_dir="code",
    entry_point="inference.py",
    framework_version="1.8",
    py_version="py3",
)
and then call
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

predictor = model_instance.deploy(
    initial_instance_count=1,
    endpoint_name="some_name",
    instance_type=instance_type,
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
)
over and over every time I change something. This takes a very long time on each iteration because it essentially starts a new Docker container (remember, I'm running locally) and waits for all dependencies to install before I can do anything with the endpoint. And if there is any error, I have to go through the whole process again.
I'd like to explore whether I can use the update_endpoint
functionality to redeploy the endpoint within the same container, without having to recreate a new container every time and then wait for all the dependency installations, etc.
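Roughly, this is what I'm hoping to be able to do (just a sketch of my intent, using the names from the code above; I'm not sure update_endpoint is even supported in local mode, and I'm not sure model_instance.name is the right thing to pass):

from sagemaker.pytorch import PyTorchPredictor
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

# Re-attach to the endpoint that is already running locally instead of
# recreating the PyTorchModel and calling deploy() from scratch.
predictor = PyTorchPredictor(
    endpoint_name="some_name",
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
)

# What I'd like: push the updated inference.py to the existing endpoint
# rather than spinning up a brand-new container.
# model_instance.name is my assumption for how to reference a model that
# already picked up the new code.
predictor.update_endpoint(
    initial_instance_count=1,
    instance_type=instance_type,
    model_name=model_instance.name,
)

Is something like this possible, or is there another way to avoid the full container restart on every code change?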