
From https://huggingface.co/Unbabel/unite-mup, there's a model that comes from the UniTE: Unified Translation Evaluation paper. The usage is documented as follows:

from comet import download_model, load_from_checkpoint

model_path = download_model("Unbabel/unite-mup")
model = load_from_checkpoint(model_path)
data = [
    {
        "src": "这是个句子。",
        "mt": "This is a sentence.",
        "ref": "It is a sentence."
    },
    {
        "src": "这是另一个句子。",
        "mt": "This is another sentence.",
        "ref": "It is another sentence."
    }
]
model_output = model.predict(data, batch_size=8, gpus=1)

Similar to How to load Unbabel Comet model without nested wrapper initialization?, load_from_checkpoint is a wrapper around the actual model class that uses the checkpoint. There is also no clear instruction on how to use a locally saved Unbabel/unite-mup model.

Is there some way to use a locally saved UniTE-MUP model with unbabel-comet for machine translation evaluation?

alvas

1 Answer


First, ensure that you have an unbabel-comet version that supports the model (the version specifier needs quoting so the shell doesn't treat >= as a redirection):

pip install "unbabel-comet>=2.0.1"
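To confirm the installed version actually meets that minimum, here is a minimal sketch that compares dotted version strings numerically; the `meets_minimum` helper is illustrative, not part of unbabel-comet:

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn a dotted version string like '2.0.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def meets_minimum(installed: str, minimum: str) -> bool:
    """True if the installed version is at least the required minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


# Example check (raises PackageNotFoundError if unbabel-comet isn't installed):
# assert meets_minimum(metadata.version("unbabel-comet"), "2.0.1")
```

Note this naive tuple comparison ignores pre-release suffixes; for anything more involved, `packaging.version.Version` is the robust choice.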

Then download the model snapshot and load the checkpoint directly with the UnifiedMetric class:

import os
from huggingface_hub import snapshot_download

from comet.models.multitask.unified_metric import UnifiedMetric

# Download the full model snapshot into the current working directory.
model_path = snapshot_download(repo_id="Unbabel/unite-mup", cache_dir=os.getcwd())
model_checkpoint_path = f"{model_path}/checkpoints/model.ckpt"

# Load the checkpoint directly, bypassing comet's load_from_checkpoint wrapper.
unite = UnifiedMetric.load_from_checkpoint(model_checkpoint_path)

Then the same usage code as documented on https://huggingface.co/Unbabel/unite-mup works:

data = [
    {
        "src": "这是个句子。",
        "mt": "This is a sentence.",
        "ref": "It is a sentence."
    },
    {
        "src": "这是另一个句子。",
        "mt": "This is another sentence.",
        "ref": "It is another sentence."
    }
]
model_output = unite.predict(data, batch_size=8, gpus=1)

# Expected SRC scores:
# [0.3474583327770233, 0.4492775797843933]
print(model_output.metadata.src_scores)

# Expected REF scores:
# [0.9252626895904541, 0.899452269077301]
print(model_output.metadata.ref_scores)

# Expected UNIFIED scores:
# [0.8758717179298401, 0.8294666409492493]
print(model_output.metadata.unified_scores)
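The three score lists are segment-aligned, so they can be zipped together for a per-segment report. A minimal sketch, using the expected values above as illustrative data in place of a real `model_output`:

```python
# Illustrative data: the expected scores from the predict() call above.
src_scores = [0.3474583327770233, 0.4492775797843933]
ref_scores = [0.9252626895904541, 0.899452269077301]
unified_scores = [0.8758717179298401, 0.8294666409492493]

# One line per segment, pairing the source-only, reference-only,
# and unified scores for easy side-by-side comparison.
rows = [
    f"segment {i}: src={s:.3f} ref={r:.3f} unified={u:.3f}"
    for i, (s, r, u) in enumerate(zip(src_scores, ref_scores, unified_scores))
]
for row in rows:
    print(row)
```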

Working example: https://colab.research.google.com/drive/1ggj_sC4dzwpjaOjv07GZq9t0bSnhAvFi?usp=sharing

alvas