1

ROUGE score metric is not working for Arabic evaluation, what should I do?

!pip install rouge_score
from datasets import load_metric
metric= load_metric("rouge")

pred_str =['السلام عليكم كيف حالك']
label_str=['السلام عليكم صديقي كيف حالك']

metric.add_batch(predictions=pred_str, references=label_str)
metric.compute()

output

{‘rouge1’: AggregateScore(low=Score(precision=0.0, recall=0.0, fmeasure=0.0), mid=Score(precision=0.0, recall=0.0, fmeasure=0.0), high=Score(precision=0.0, recall=0.0, fmeasure=0.0)),
‘rouge2’: AggregateScore(low=Score(precision=0.0, recall=0.0, fmeasure=0.0), mid=Score(precision=0.0, recall=0.0, fmeasure=0.0), high=Score(precision=0.0, recall=0.0, fmeasure=0.0)),
‘rougeL’: AggregateScore(low=Score(precision=0.0, recall=0.0, fmeasure=0.0), mid=Score(precision=0.0, recall=0.0, fmeasure=0.0), high=Score(precision=0.0, recall=0.0, fmeasure=0.0)),
‘rougeLsum’: AggregateScore(low=Score(precision=0.0, recall=0.0, fmeasure=0.0), mid=Score(precision=0.0, recall=0.0, fmeasure=0.0), high=Score(precision=0.0, recall=0.0, fmeasure=0.0))}

0 Answers0