
I'm trying to deploy a model to a SageMaker endpoint using a custom Dockerfile:

ARG REGION=us-east-1

FROM 763104351884.dkr.ecr.$REGION.amazonaws.com/pytorch-inference:2.0.1-gpu-py310-cu118-ubuntu20.04-sagemaker

RUN pip install poetry

RUN poetry config virtualenvs.create false

WORKDIR /opt/

RUN poetry new code --name models

WORKDIR /opt/code/

RUN poetry add json-lines sagemaker-inference

ADD tuta models/tuta

ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/code

ENV SAGEMAKER_PROGRAM models/tuta/sm_inference.py

The models/tuta directory contains several model files (layers, metrics, ...) along with the sm_inference.py file:

from models.tuta.inference import TUTAForCTC
import json
import os

JSON_CONTENT_TYPE = 'application/json'

def model_fn(model_dir):
    print("loading the model!")   
    model = TUTAForCTC(model_bin=os.path.join(model_dir, "tuta-ctc.bin"), model_config_path=os.path.join(model_dir, "config.json"))
    print("model loaded!")   
    return model

def predict_fn(data, model):
    print("predicting...")   
    return {"response": data}
    # return model.predict(data['hier_table'], data['flat_table'], data['table_range'])

def input_fn(serialized_input_data, content_type=JSON_CONTENT_TYPE):
    print("reading input...")   
    return json.loads(serialized_input_data)
    
def output_fn(prediction, content_type):
    return prediction

The endpoint deploys and reaches the InService status, and pinging it returns a 200 response. But once I send a request, I get an error and the ping response becomes 500.

Djellal Mohamed Aniss

1 Answer


I would suggest running the container locally and checking whether you can call /invocations on it. Also check the CloudWatch logs for the endpoint to see whether any error is reported there.

To imitate the SageMaker hosted environment you can run the container as follows:

docker run -v $(pwd)/<PathToYourModelArtifacts>:/opt/ml -p 8080:8080 --rm <yourImageURI> serve

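To send a prediction request, where ${payload} is the path to a file containing the request body and ${content} is its content type (for example application/json):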

curl --data-binary @${payload} -H "Content-Type: ${content}" -v http://localhost:8080/invocations
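If you prefer to script the check, the same calls can be made from Python. This is only a minimal sketch using the standard library: it assumes the container is already running locally on port 8080 as started above, and the helper names (`build_invocation_request`, `ping`, `invoke`) and the empty sample payload are hypothetical. The payload keys are taken from the commented-out `model.predict` call in the question.

```python
import json
import urllib.error
import urllib.request

# Assumption: the container was started with -p 8080:8080 as shown above.
BASE_URL = "http://localhost:8080"


def build_invocation_request(payload, base_url=BASE_URL):
    """Build a JSON POST request for the container's /invocations route."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/invocations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ping(base_url=BASE_URL, timeout=5):
    """Return the HTTP status code of GET /ping."""
    try:
        with urllib.request.urlopen(f"{base_url}/ping", timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is still useful here.
        return err.code


def invoke(payload, base_url=BASE_URL, timeout=60):
    """POST the payload to /invocations and return (status, response body)."""
    request = build_invocation_request(payload, base_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Keep the error body: it often contains the handler's traceback.
        return err.code, err.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Hypothetical placeholder payload; replace with a real table.
    sample = {"hier_table": [], "flat_table": [], "table_range": ""}
    print("ping before:", ping())
    print("invocation:", invoke(sample))
    print("ping after:", ping())
```

Comparing the ping status before and after the invocation reproduces the symptom from the question (200, then 500) on your own machine, where the container's output, including any traceback from the handler, is printed directly in the terminal running `docker run`.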

https://github.com/aws/amazon-sagemaker-examples/tree/main/advanced_functionality/scikit_bring_your_own/container/local_test

Marc Karp