I would like to use the Helsinki-NLP/opus-mt-de-en model from Hugging Face to translate text. This works fine with the Hugging Face Inference API or with a Transformers pipeline (here backed by Optimum's ONNX Runtime model class), e.g.:
from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSeq2SeqLM

# Export the model to ONNX on the fly and wrap it for ONNX Runtime
model = ORTModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-de-en", from_transformers=True)
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-de-en")

# Build a regular Transformers translation pipeline on top of the ONNX model
onnx_translation = pipeline("translation_de_to_en", model=model, tokenizer=tokenizer)
result = onnx_translation("Dies ist ein Test!")
print(result)  # [{'translation_text': 'This is a test!'}]
However, as part of a project I need to run the model directly with ONNX Runtime, i.e. with a plain onnxruntime.InferenceSession rather than through the Optimum wrapper. I was able to export the model to ONNX format successfully, but when I decode the output of the InferenceSession I get the following:
<unk> <unk> <unk> <unk> <unk>.<unk> <unk> <unk>,<unk>,<unk>,.<unk> <unk>,<unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk>.<unk> <unk> <unk>,<unk> <unk> <unk>. the,<unk>,,.<unk> <unk> <unk>,<unk>, the<unk> <unk> <unk> <unk> <unk>.<unk> <unk> <unk> <unk> <unk>.<unk> <unk> <unk> <unk>.<unk> <unk> <unk> <unk> the.. in<unk> <unk>.<unk> <unk>,<unk> <unk> <unk> the<unk> <unk> <unk> <unk>.<unk> the<unk> <unk> the<unk> <unk> <unk>.<unk> <unk> <unk> <unk> <unk>,<unk> in<unk> the<unk>,<unk>,<unk> <unk> in, the in<unk> <unk> <unk> s<unk>. the.<unk> <unk>, in,<unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk>. a,,<unk> <unk> <unk>.<unk>.<unk> <unk>.<unk> is<unk>,<unk> in,,<unk> <unk> the<unk> the the. in<unk> <unk> <unk>,<unk> <unk> <unk> <unk>. in<unk>,,,<unk>.<unk> <unk>. of<unk> in<unk>.<unk>,<unk> <unk> <unk> the<unk> <unk> <unk> <unk> <unk> die,.<unk>,<unk> <unk> <unk> die,<unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk> the<unk>.<unk> <unk> <unk> <unk> <unk> of.<unk> <unk> <unk>.<unk> in<unk>, the<unk> <unk> <unk> <unk> <unk> <unk> <unk> <unk>.<unk> <unk>,.<unk> <unk>,<unk>,<unk> <unk>,<unk> <unk>,<unk> <unk> <unk>,<unk>,<unk> <unk> <unk>,<unk>.<unk> of<unk>.<unk> of, the<unk> the.<unk> <unk>
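For context, the export and session setup look roughly like this. It's a simplified sketch: I'm using the stock transformers.onnx exporter as an illustration here, so treat the command and the onnx/model.onnx path as placeholders for what the notebook actually does.

# Placeholder for the export step (simplified; see the notebook for the real one):
# python -m transformers.onnx --model=Helsinki-NLP/opus-mt-de-en onnx/
import onnxruntime

# Load the exported graph into a plain InferenceSession
session = onnxruntime.InferenceSession("onnx/model.onnx")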
This is the relevant inference code:
import numpy as np
from transformers import AutoTokenizer

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-de-en")

# Encode input
encoded_input = tokenizer("Dies ist ein Test!")

# Create model input dictionary (add a batch dimension; ONNX Runtime expects int64 numpy arrays)
model_input = {
    'input_ids': np.array([encoded_input.input_ids], dtype=np.int64),
    'attention_mask': np.array([encoded_input.attention_mask], dtype=np.int64),
    'decoder_input_ids': np.array([encoded_input.input_ids], dtype=np.int64),
    'decoder_attention_mask': np.array([encoded_input.attention_mask], dtype=np.int64)
}

# Run inference
output = session.run(['last_hidden_state'], model_input)
# output[0] has shape (batch, sequence, hidden); this takes the first batch element's first hidden vector
last_hidden_state = output[0][0][0]

# Decode output
decoded_output = tokenizer.decode(last_hidden_state, skip_special_tokens=True)
print(decoded_output)  # Expected: "This is a test!"
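For reference, the graph's expected inputs and outputs can be inspected on the session like this (it only shows what the exported model expects, not whether I'm feeding it correctly):

# Inspect the exported graph's input and output signatures
for inp in session.get_inputs():
    print(inp.name, inp.shape, inp.type)
for out in session.get_outputs():
    print(out.name, out.shape, out.type)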
The complete code can be found in this Colab notebook as a reproducible example.
I don't have much experience with ONNX and the MarianMT models yet. What am I doing wrong and how can I decode the text correctly?