I am using a Hugging Face model (sentence-transformers/distiluse-base-multilingual-cased-v2) on my local system to convert text into vectors.
import numpy as np
import torch
from sentence_transformers import SentenceTransformer

class SequenceEncoder(object):
    def __init__(self, device=None):
        self.device = device
        # Produces 512-dimensional dense vectors
        self.multi_model = SentenceTransformer('distiluse-base-multilingual-cased-v2', device=device)

    @torch.no_grad()
    def __call__(self, col, dialect_list=None):
        if isinstance(col, str):
            vals = [col]
        else:
            # col is a pandas Series; replace NaN/0 entries with empty strings
            vals = col.replace([np.nan, 0], '').values.tolist()
        x = self.multi_model.encode(vals, show_progress_bar=True, convert_to_tensor=True, device=self.device)
        return x.cpu()
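To illustrate the non-string branch: on a pandas Series, the `replace` call swaps NaN and 0 entries for empty strings before encoding. A toy example (not my actual data):

```python
import numpy as np
import pandas as pd

# Hypothetical column with missing/zero entries, as __call__ might receive it
col = pd.Series(["hello", np.nan, 0, "world"])

# Same cleaning step as in __call__: NaN and 0 become empty strings
vals = col.replace([np.nan, 0], "").values.tolist()
print(vals)  # → ['hello', '', '', 'world']
```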
The local model's output for a given text looks as follows: [-0.00285156 -0.04651115 -0.00723144 -0.04229123 -0.02418377 0.00646215 ...]
Recently, I decided to deploy the same model on Amazon SageMaker using the following configuration:
HF_MODEL_ID: sentence-transformers/distiluse-base-multilingual-cased-v2
HF_TASK: feature-extraction
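For context, those two environment variables map onto the SageMaker Python SDK roughly like this (a sketch of the deployment, not my exact setup; the role ARN, container versions, and instance type below are placeholders/assumptions):

```python
from sagemaker.huggingface import HuggingFaceModel

# The two configuration values from above
hub = {
    "HF_MODEL_ID": "sentence-transformers/distiluse-base-multilingual-cased-v2",
    "HF_TASK": "feature-extraction",
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role="arn:aws:iam::...:role/my-sagemaker-role",  # placeholder, use your own role
    transformers_version="4.26",   # assumed container versions
    pytorch_version="1.13",
    py_version="py39",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",  # placeholder instance type
)

print(predictor.predict({"inputs": "some test sentence"}))
```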
After successfully deploying the model and creating an endpoint on SageMaker, I tested it with the same input text. However, the output from SageMaker differs from the local model's output:
SageMaker output: [-0.035367466509342194, 0.011641714721918106, -0.04396483674645424, 0.03655952587723732, ...]
I noticed two main discrepancies in the results:
Differences in values: the numerical values in the vectors from SageMaker differ from the local model's output.
Differences in printed precision: the SageMaker vectors show far more decimal places than the local model's vectors.
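To check whether the second discrepancy is just display formatting (NumPy abbreviates float32 arrays when printing, while JSON-decoded floats are Python float64 and print at full repr precision), I compared the first few values numerically. A minimal sketch using the values above:

```python
import json
import numpy as np

# Local output: a float32 array; NumPy's repr truncates to ~8 significant digits
local = np.array([-0.00285156, -0.04651115], dtype=np.float32)
print(local)  # e.g. [-0.00285156 -0.04651115]

# SageMaker output: JSON-decoded floats are float64 and print at full precision
remote = json.loads("[-0.035367466509342194, 0.011641714721918106]")
print(remote)

# Extra digits alone don't mean different values; compare numerically instead
a = np.asarray(local, dtype=np.float64)
b = np.asarray(remote, dtype=np.float64)
print(np.allclose(a, b, atol=1e-4))  # → False: the vectors genuinely differ
```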
I also tried removing the convert_to_tensor=True parameter from the local call, but the discrepancies remained.
I would appreciate any insight into why these differences occur between the local model and the model deployed on SageMaker. Additionally, I would like to know whether there is a way to make SageMaker return tensors as well.
Thank you in advance for your assistance!