I am using a Hugging Face model (sentence-transformers/distiluse-base-multilingual-cased-v2) on my local system to convert text into vectors.
import numpy as np
import torch
from sentence_transformers import SentenceTransformer

class SequenceEncoder(object):
    def __init__(self, device=None):
        self.device = device
        # Produces 512-dimensional dense vectors
        self.multi_model = SentenceTransformer('distiluse-base-multilingual-cased-v2', device=device)

    @torch.no_grad()
    def __call__(self, col, dialect_list=None):
        if isinstance(col, str):
            vals = [col]
        else:
            # col is a pandas Series; replace NaN/0 entries with empty strings
            vals = col.replace([np.nan, 0], '').values.tolist()
        x = self.multi_model.encode(vals, show_progress_bar=True, convert_to_tensor=True, device=self.device)
        return x.cpu()
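To illustrate the non-string branch: on a pandas Series, the `replace` call swaps NaN and 0 entries for empty strings before encoding. A toy example (not my actual data):

```python
import numpy as np
import pandas as pd

# Hypothetical column with missing/zero entries, as __call__ might receive it
col = pd.Series(["hello", np.nan, 0, "world"])

# Same cleaning step as in __call__: NaN and 0 become empty strings
vals = col.replace([np.nan, 0], "").values.tolist()
print(vals)  # → ['hello', '', '', 'world']
```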
The local model's output for a given text looks as follows: [-0.00285156 -0.04651115 -0.00723144 -0.04229123 -0.02418377 0.00646215 ...]
Recently, I decided to deploy the same model on Amazon SageMaker using the following configuration:
HF_MODEL_ID: sentence-transformers/distiluse-base-multilingual-cased-v2
HF_TASK: feature-extraction
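For context, those two environment variables map onto the SageMaker Python SDK roughly like this (a sketch of the deployment, not my exact setup; the role ARN, container versions, and instance type below are placeholders/assumptions):

```python
from sagemaker.huggingface import HuggingFaceModel

# The two configuration values from above
hub = {
    "HF_MODEL_ID": "sentence-transformers/distiluse-base-multilingual-cased-v2",
    "HF_TASK": "feature-extraction",
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role="arn:aws:iam::...:role/my-sagemaker-role",  # placeholder, use your own role
    transformers_version="4.26",   # assumed container versions
    pytorch_version="1.13",
    py_version="py39",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",  # placeholder instance type
)

print(predictor.predict({"inputs": "some test sentence"}))
```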
After successfully deploying the model and creating an endpoint on SageMaker, I tested it with the same input text. However, the output from SageMaker differs from the local model's output:
SageMaker output: [-0.035367466509342194, 0.011641714721918106, -0.04396483674645424, 0.03655952587723732, ...]
I noticed two main discrepancies in the results:
Differences in values: the numerical values in the vectors from SageMaker differ from the local model's output.
Differences in printed precision: the SageMaker vectors show far more decimal places than the local model's vectors.
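To check whether the second discrepancy is just display formatting (NumPy abbreviates float32 arrays when printing, while JSON-decoded floats are Python float64 and print at full repr precision), I compared the first few values numerically. A minimal sketch using the values above:

```python
import json
import numpy as np

# Local output: a float32 array; NumPy's repr truncates to ~8 significant digits
local = np.array([-0.00285156, -0.04651115], dtype=np.float32)
print(local)  # e.g. [-0.00285156 -0.04651115]

# SageMaker output: JSON-decoded floats are float64 and print at full precision
remote = json.loads("[-0.035367466509342194, 0.011641714721918106]")
print(remote)

# Extra digits alone don't mean different values; compare numerically instead
a = np.asarray(local, dtype=np.float64)
b = np.asarray(remote, dtype=np.float64)
print(np.allclose(a, b, atol=1e-4))  # → False: the vectors genuinely differ
```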
I also tried removing the convert_to_tensor=True parameter from the local call, but the discrepancies remained.
I would appreciate any insight into why these differences occur between the local model and the model deployed on SageMaker. Additionally, I would like to know whether there is a way to make SageMaker return tensors as well.
Thank you in advance for your assistance!