I am evaluating a model on the NarrativeQA story task, and the metrics reported for it are ROUGE, BLEU-1/4, and METEOR. What is the standard practice for evaluating on this dataset? Do I average the ROUGE score across the documents or per question?
from ignite.metrics import RougeL

# ROUGE-L F-score (alpha=0.5), taking the best score over multiple references
evaluator = RougeL(multiref="best", alpha=0.5)
evaluator.update(([predicted_response], [references]))
I'm using this right now and updating after every question; the metric is imported from PyTorch-Ignite.
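Concretely, the loop I mean looks roughly like the sketch below. The qa_pairs data and the whitespace split() tokenization are just placeholders for my actual answers and tokenizer, and as far as I understand Ignite's Rouge metrics, compute() returns the per-question scores averaged over all update() calls, i.e. a macro-average over questions rather than over documents.

from ignite.metrics import RougeL

# Placeholder data: one model answer plus the reference answers for each question.
qa_pairs = [
    ("mark antony", ["Mark Antony", "Antony"]),
    ("in rome", ["In Rome", "Rome"]),
]

evaluator = RougeL(multiref="best", alpha=0.5)

for predicted_response, reference_answers in qa_pairs:
    # Ignite's Rouge metrics expect token sequences, not raw strings.
    candidate = predicted_response.split()
    references = [ref.split() for ref in reference_answers]
    evaluator.update(([candidate], [references]))

# Averaged over every update() call, so effectively per question.
scores = evaluator.compute()
print(scores)

Is this per-question accumulation the right way to do it, or should I instead compute a score per document (story) and then average those document-level scores?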