I would like to host a model on SageMaker using the new Serverless Inference.
I wrote my own container for inference and a handler, following several guides. These are the requirements:
mxnet
multi-model-server
sagemaker-inference
retrying
nltk
transformers==4.12.4
torch==1.10.0
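For context, the container's handler follows the usual multi-model-server contract: the handler file exposes a `handle(data, context)` entrypoint that MMS calls once with `data=None` at load time and then with request batches. A minimal sketch of that contract (the model class and payload shape here are placeholders, not my actual code):

```python
import json


class _Model:
    """Placeholder for the real model; loaded once on first invocation."""

    def predict(self, text):
        # Stand-in for the real transformers/torch inference
        return {"tokens": text.split()}


_model = None


def handle(data, context):
    """MMS-style entrypoint: called with data=None once at model load."""
    global _model
    if _model is None:
        _model = _Model()
    if data is None:
        # Load-time warm-up call; nothing to return
        return None
    # MMS delivers a batch of requests; each item carries a "body"
    payload = json.loads(data[0]["body"])
    return [json.dumps(_model.predict(payload["text"]))]
```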
On non-serverless endpoints this container works perfectly well. With the serverless variant, however, I get the following error message when the model is loaded:
ERROR - /.sagemaker/mms/models/model already exists.
The error is thrown by the following subprocess:
['model-archiver', '--model-name', 'model', '--handler', '/home/model-server/handler_service.py:handle', '--model-path', '/opt/ml/model', '--export-path', '/.sagemaker/mms/models', '--archive-format', 'no-archive']
So it seems to be something to do with model-archiver (which I assume is a tool from the multi-model-server package?).