Questions tagged [bleu]

BLEU (Bilingual Evaluation Understudy) is an algorithm for evaluating the quality of text which has been machine-translated from one natural language to another.

55 questions
2 votes • 1 answer

Why does sacrebleu return a zero BLEU score for short sentences?

Why does sacrebleu require sentences to end with a dot? If I remove the dots, the value is zero. import sacrebleu, nltk sys = ["This is cat."] refs = [["This is a cat."], ["This is a bad cat."]] b3 = sacrebleu.corpus_bleu(sys, refs) print("b3",…
Ahmad • 8,811
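A dependency-free sketch of why this happens (not sacrebleu's implementation): BLEU-4 is a geometric mean over clipped 1- to 4-gram precisions, so a hypothesis shorter than four tokens contributes no 4-grams at all, and the unsmoothed score collapses to zero.

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(hyp, refs, n):
    """Clipped n-gram precision from the BLEU paper; returns (clipped, total)."""
    hyp_counts = Counter(ngrams(hyp, n))
    max_ref = Counter()
    for ref in refs:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in hyp_counts.items())
    return clipped, sum(hyp_counts.values())

hyp = "This is cat".split()                      # only 3 tokens -> no 4-grams
refs = ["This is a cat".split(), "This is a bad cat".split()]
precisions = [modified_precision(hyp, refs, n) for n in range(1, 5)]
# precisions[-1] is (0, 0): with no 4-grams the p4 denominator is zero, so the
# geometric mean over p1..p4 -- and with it unsmoothed BLEU-4 -- is 0.
```

Removing the final dot shortens the tokenized hypothesis below four tokens, which is why the score flips to zero; the dot itself is not special.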
2 votes • 1 answer

Average of BLEU scores on two subsets of data is not the same as overall score

For evaluating a sequence generation model, I'm using BLEU-1 through BLEU-4. I split the test set into two subsets and calculated the scores on each subset separately, as well as on the whole test set. Surprisingly, the result I get from the whole test set is…
forough • 47
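This mismatch is expected: corpus-level BLEU pools n-gram counts over all sentences before dividing, while averaging two subset scores weights each subset equally regardless of its length. A minimal sketch with unigram precision only (not a full BLEU implementation):

```python
from collections import Counter

def clipped_counts(hyp, ref):
    """Clipped unigram matches for one (hypothesis, reference) pair."""
    hyp_c, ref_c = Counter(hyp), Counter(ref)
    matched = sum(min(c, ref_c[w]) for w, c in hyp_c.items())
    return matched, sum(hyp_c.values())

def corpus_precision(pairs):
    """Pool counts over the whole corpus *before* dividing."""
    num = den = 0
    for hyp, ref in pairs:
        m, t = clipped_counts(hyp, ref)
        num += m
        den += t
    return num / den

subset_a = [("the cat sat".split(), "the cat sat".split())]          # 3 of 3 match
subset_b = [("a b c d e f g h".split(), "p q r s t u v w".split())]  # 0 of 8 match

p_a = corpus_precision(subset_a)               # 1.0
p_b = corpus_precision(subset_b)               # 0.0
p_all = corpus_precision(subset_a + subset_b)  # 3/11, not (1.0 + 0.0) / 2
```

Because the second subset contributes more tokens to the pooled denominator, the whole-corpus number sits far from the simple average of the two subset scores.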
2 votes • 1 answer

BLEU error: n-gram overlaps of lower order

I ran the code below: a = ['dog', 'in', 'plants', 'crouches', 'to', 'look', 'at', 'camera'] b = ['a', 'brown', 'dog', 'in', 'the', 'grass', ' ', ' '] from nltk.translate.bleu_score import corpus_bleu bleu1 = corpus_bleu(a, b, weights=(1.0, 0, 0,…
Dung Dao • 51
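The warning in this question usually comes from the argument nesting, not the n-grams themselves: `corpus_bleu` wants a list of reference *lists* per hypothesis and a list of hypothesis token lists. A shape-check sketch (pure Python, not NLTK itself):

```python
# Nesting corpus_bleu expects:
#   list_of_references -- for each hypothesis, a list of reference token lists
#   hypotheses         -- a list of hypothesis token lists
hypothesis = ['dog', 'in', 'plants', 'crouches', 'to', 'look', 'at', 'camera']
reference = ['a', 'brown', 'dog', 'in', 'the', 'grass']

list_of_references = [[reference]]   # 1 hypothesis -> its list of references
hypotheses = [hypothesis]

def shapes_ok(list_of_references, hypotheses):
    """True when the nesting matches what corpus_bleu wants."""
    if len(list_of_references) != len(hypotheses):
        return False
    return all(
        isinstance(hyp, list) and all(isinstance(tok, str) for tok in hyp)
        and all(isinstance(ref, list) for ref in refs)
        for refs, hyp in zip(list_of_references, hypotheses)
    )

# corpus_bleu(a, b, ...) with the flat lists from the question makes NLTK
# treat each *token string* as its own sentence of characters, so higher-order
# n-grams find no overlap and the "lower order" warning fires.
```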
1 vote • 0 answers

ROUGE score averaged across documents or per question?

I am evaluating a model on the NarrativeQA story task, and the given metrics are ROUGE, BLEU-1/4, and METEOR. What is the standard practice for evaluating on datasets? Do I average the ROUGE score across the documents or per question? evaluator…
1 vote • 1 answer

Early stopping based on BLEU in FairSeq

My goal is to use BLEU as early stopping metric while training a translation model in FairSeq. Following the documentation, I am adding the following arguments to my training script: --eval-bleu --eval-bleu-args --eval-bleu-detok…
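One common way these flags are wired together, adapted from fairseq's translation example; the data path, architecture, and patience value below are placeholders, not prescriptions:

```shell
fairseq-train data-bin/my-corpus \
    --arch transformer --task translation \
    --eval-bleu \
    --eval-bleu-args '{"beam": 5, "max_len_a": 1.2, "max_len_b": 10}' \
    --eval-bleu-detok moses \
    --eval-bleu-remove-bpe \
    --best-checkpoint-metric bleu \
    --maximize-best-checkpoint-metric \
    --patience 10
```

Note that `--eval-bleu-args` takes a JSON string of generation options, and `--patience` is what actually stops training: here, after 10 consecutive validations without a new best BLEU.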
1 vote • 0 answers

Why would combining two feature vectors degrade machine learning performance?

I'm working on a video captioning project and have two feature vectors extracted for each video as NumPy files. One has shape (15, 2048), extracted by a 3D CNN, and the other has shape (15, 1536), extracted by…
A_B_Y • 332
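For the shapes in the question, frame-wise concatenation is the usual fusion step; a dependency-free sketch (stand-in constant features, since the real extractors aren't shown):

```python
# Fuse (15, 2048) 3D-CNN features with a (15, 1536) second stream per frame.
frames = 15
feat_a = [[0.1] * 2048 for _ in range(frames)]   # stand-in 3D-CNN features
feat_b = [[0.9] * 1536 for _ in range(frames)]   # stand-in second stream

fused = [a + b for a, b in zip(feat_a, feat_b)]  # per-frame concat -> (15, 3584)

# Mismatched scales between streams are a classic reason fusion hurts: the
# higher-magnitude stream dominates. Normalizing each stream (e.g. to unit
# L2 norm per frame) before concatenating is a common first fix.
```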
1 vote • 0 answers

What are the differences between the pycocoevalcap and NLTK methods for BLEU score calculation?

For image captioning, the BLEU score can be calculated using the pycocoevalcap repo or the NLTK Python library, e.g.: `from utils.coco_caption.pycocoevalcap.bleu.bleu import Bleu # // by pycocoevalcap . . . from nltk.translate.bleu_score import…
ENG_AI • 23
1 vote • 1 answer

What are the differences between BLEU score and METEOR?

I am trying to understand how machine translation evaluation scores work. I understand what the BLEU score is trying to achieve: it looks at different n-grams (BLEU-1, BLEU-2, BLEU-3, BLEU-4) and tries to match them with the human…
Exploring • 2,493
1 vote • 1 answer

Why is the BLEU score zero for this pair even though the sentences are similar?

Why am I getting a score of 0 even though the sentences are similar? Please refer to the code below: from nltk.translate.bleu_score import sentence_bleu score = sentence_bleu(['where', 'are', 'economic', 'networks'], ['where', 'are', 'the', 'economic',…
Hagout Fan • 13
1 vote • 1 answer

I compare two identical sentences with NLTK's BLEU and don't get 1.0. Why?

I'm trying to use the BLEU score from NLTK for quality evaluation of machine translation. I wanted to check this code with two identical sentences; here I'm using method1 as a smoothing function because I'm comparing two sentences and not…
slow_war • 25
1 vote • 1 answer

NLTK sentence_bleu method 7 gives scores above 1

When using the NLTK sentence_bleu function in combination with SmoothingFunction method 7, the maximum score is 1.1167470964180197, even though the BLEU score is defined to be between 0 and 1. This score shows up for perfect matches with the reference.…
Rink Stiekema • 394
1 vote • 0 answers

Construct a heatmap matrix with a custom metric in Python

I want to construct a heatmap matrix with a custom metric: the BLEU score. I have 20 sentences that I want to compare this way. I tried to use sns.heatmap or sns.clustermap and then add sentence_bleu as a metric function, but…
ASUCB • 51
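One workable approach: `sns.heatmap` plots a precomputed matrix rather than calling a metric function for you, so build the pairwise matrix first and pass it in. A sketch with a simple Jaccard token overlap standing in for sentence_bleu, to keep it dependency-free:

```python
def overlap(a, b):
    """Jaccard token overlap -- a stand-in for any pairwise sentence metric."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

sentences = ["the cat sat", "the cat ran", "dogs bark loudly"]
# Pairwise score matrix: rows and columns both index the sentence list.
matrix = [[overlap(s, t) for t in sentences] for s in sentences]

# seaborn usage (assumed installed):
# import seaborn as sns
# sns.heatmap(matrix, xticklabels=sentences, yticklabels=sentences)
```

Swapping `overlap` for a wrapper around `sentence_bleu` gives the BLEU heatmap; note BLEU is asymmetric, so the matrix need not mirror across the diagonal.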
1 vote • 1 answer

Running NLTK sentence_bleu in Pandas

I am trying to apply sentence_bleu to a column in Pandas to rate the quality of some machine translation. But the scores it is outputting are incorrect. Can anyone see my error? import pandas as pd from nltk.translate.bleu_score import…
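The usual culprit in this setup is passing raw strings to sentence_bleu, which NLTK then iterates character by character; each cell should be tokenized first. A pure-Python stand-in for the row-wise call (unigram precision in place of sentence_bleu, so the sketch runs without pandas or NLTK):

```python
rows = [
    {"reference": "the cat sat on the mat", "candidate": "the cat sat on a mat"},
]

def score_row(row):
    ref_tokens = row["reference"].split()    # token lists, not raw strings
    cand_tokens = row["candidate"].split()
    # With NLTK this step would be:
    #   sentence_bleu([ref_tokens], cand_tokens)
    matched = sum(1 for tok in cand_tokens if tok in ref_tokens)
    return matched / len(cand_tokens)

scores = [score_row(r) for r in rows]
# pandas equivalent (assumed column names): df["bleu"] = df.apply(score_row, axis=1)
```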
1 vote • 0 answers

Corpus BLEU score calculation

I am trying to calculate the BLEU score for text summarization. Before that, I need to know how corpus BLEU calculates the score between given references and a candidate. I have 3 references and 2 candidates, and I am calling corpus bleu on the first two references…
0 votes • 0 answers

keras_nlp.metrics.Bleu ValueError: y_pred must be of rank 0, 1 or 2. Found rank: 3

I am fine-tuning the google/mT5-small model on Google Colab for a translation task. To assess translation quality, I am using the keras_nlp.metrics.Bleu metric to compute the BLEU score. However, I…
王冠侖 • 1