I am trying to set up a multi-model endpoint (or, more accurately, to re-set it up: I am fairly sure it was working a while ago on an earlier version of SageMaker) to do language translation, but I am constantly met with the same issue. This is what I am trying to run, from a notebook on SageMaker:
import sagemaker
from sagemaker.pytorch.model import PyTorchModel
from sagemaker.predictor import JSONSerializer, JSONDeserializer
role = 'role_name...'
pytorch_model = PyTorchModel(
    model_data='s3://foreign-language-models/opus-mt-ROMANCE-en.tar.gz',
    role=role,
    framework_version="1.3.1",
    py_version="py3",
    source_dir="code",
    entry_point="deploy_multi_model.py")
x = pytorch_model.predictor_cls(endpoint_name='language-translation')
x.serializer = JSONSerializer()
x.deserializer = JSONDeserializer()
x.predict({'model_name': 'opus-mt-ROMANCE-en', 'text': ["Hola que tal?"]})
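For context on what the worker actually receives: as far as I understand it, JSONSerializer just JSON-encodes the payload and sends it with Content-Type application/json, and the input_fn in my deploy_multi_model.py has to decode it back (the 'model_name' key is my own routing convention inside that script, not a SageMaker-defined field). A minimal stdlib sketch of that round trip with my exact payload:

```python
import json

# The exact payload passed to predict() above; 'model_name' is a key my own
# inference script uses to pick a model -- it is not a SageMaker-defined field.
payload = {'model_name': 'opus-mt-ROMANCE-en', 'text': ["Hola que tal?"]}

# JSONSerializer sends json.dumps(payload) with Content-Type application/json;
# the container's input_fn is then expected to json.loads() the request body.
body = json.dumps(payload)
decoded = json.loads(body)

print(decoded['text'][0])  # Hola que tal?
```

So the request body itself is trivial; nothing in the payload looks like it should kill a worker.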
When I run the predict call above, I am met with the error:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "{
"code": 500,
"type": "InternalServerException",
"message": "Worker died."
}
And when I investigate the logs the error links to, the only notable entry says:
epollEventLoopGroup-4-1 com.amazonaws.ml.mms.wlm.WorkerThread - 9000 Worker disconnected. WORKER_MODEL_LOADED
But I cannot figure out why this is happening. Any help would be greatly appreciated as this is currently driving me insane! And if you need any more information from me to help, don't hesitate to ask.