I am building an automatic translator in moses. To improve its performance, I use log-linear weight optimisation. This technique has a random component, which can affect slightly the final result (but I do not know exactly how much).
Suppose that the current performance of the model is 25 BLEU.
Suppose now I modify the language model (e.g. change the smoothing), and I get a performance of 26 BLEU.
My question is: how can I know if the improvement is because the modification, or is just noise from the random component?