
I am deploying a Hugging Face model with a custom pipeline to SageMaker. My model.tar.gz has the following structure:

├── added_tokens.json
├── code
│   ├── inference.py
│   ├── pipeline.py
│   └── requirements.txt
├── config.json
├── generation_config.json
├── model-00001-of-00002.safetensors
├── model-00002-of-00002.safetensors
├── model.safetensors.index.json
├── special_tokens_map.json
├── tokenizer_config.json
├── tokenizer.json
└── tokenizer.model

I deployed my model via

from sagemaker.huggingface.model import HuggingFaceModel

hub = {
   'HF_TASK':'text-generation'
}
huggingface_model = HuggingFaceModel(
    env=hub,
    model_data="s3://my_model_bucket/model.tar.gz",
    role=role,
    transformers_version="4.28",
    pytorch_version="2.0",
    py_version="py310",
)

# deploy the endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.xlarge"
    )
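
For reference, this is roughly how I invoke the endpoint (a minimal sketch; my real payload is more elaborate):

# minimal invocation sketch -- the actual prompt and parameters are longer
response = predictor.predict({
    "inputs": "Hello, how are you?"
})
print(response)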

However, when I invoke the endpoint, I get this response:

{
  "code": 400,
  "type": "InternalServerException",
  "message": "/opt/ml/model does not appear to have a file named config.json. Checkout \u0027https://huggingface.co//opt/ml/model/None\u0027 for available files."
}

Another error from the worker logs:

W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - OSError: /opt/ml/model does not appear to have a file named config.json. Checkout 'https://huggingface.co//opt/ml/model/None' for available files.

But config.json is clearly in my model directory. Here is my inference.py code:

import torch
from typing import Dict
from transformers import AutoTokenizer, AutoModelForCausalLM
from pipeline import MyCustomPipeline

pipeline = None

def model_fn(model_dir):
    print("Loading model from: " + model_dir)
    tokenizer = AutoTokenizer.from_pretrained(
        model_dir,
        local_files_only=True,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_dir,
        local_files_only=True,
        device_map="auto",
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
    )
    pipeline = MyCustomPipeline(model, tokenizer)
    return model, tokenizer

def transform_fn(model, input_data, content_type, accept):
    return pipeline(input_data)

What am I doing wrong here? As far as I can tell, I have followed all the steps needed to deploy a Hugging Face model to SageMaker.


1 Answer


I encountered the same error twice, for two different reasons:

  1. The S3 path I passed as model_data was simply wrong.
  2. When I created the tar.gz, the files were packed under the wrong directory inside the archive rather than at the archive root, where they are supposed to be.

Double-check the tar file contents and make sure they match the layout you showed at the top of your question:

!tar -xvf "model.tar.gz"
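
If the listing shows everything nested under an extra top-level folder, rebuild the archive so the files sit at its root (that is the layout SageMaker extracts into /opt/ml/model). A minimal sketch in Python, assuming your local files live in a directory named model/:

import os
import tarfile

model_dir = "model"  # hypothetical local directory holding config.json, the safetensors shards, code/, etc.

# Add every entry at the root of the archive (no leading folder),
# so config.json ends up directly under /opt/ml/model on the endpoint.
with tarfile.open("model.tar.gz", "w:gz") as tar:
    for name in os.listdir(model_dir):
        tar.add(os.path.join(model_dir, name), arcname=name)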