
I built a Transformer model for recovering text: the source text may contain redundant, missing, or wrong words, and my model has to correct as many of them as possible. I also want the model to learn the embedding of the correct sentence, so both the sources and the targets are sequences of embeddings. Therefore my loss function, Cross Entropy, takes two embedding sequences as input and target. In addition, this model is part of a larger model whose main criterion is Negative Log-Likelihood.

Unfortunately, the Cross Entropy loss drops below 0.0 after a few epochs, so the sum of the Cross Entropy and Negative Log-Likelihood terms goes below 0.0 as well. This prevents the whole model from converging.
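Here is a minimal sketch of what I think is happening (assuming PyTorch's `nn.CrossEntropyLoss`; the tensor values are made up for illustration). With a float target, the loss is computed as `-sum(target * log_softmax(input))`, which assumes the target is a probability distribution. An embedding vector can have negative components, so the loss can become negative:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# A valid float target is a probability distribution (non-negative, sums to 1).
# An embedding vector is neither, e.g. it can contain negative components.
logits = torch.zeros(1, 2)                # log_softmax gives log(0.5) per class
bad_target = torch.tensor([[-1.0, 0.0]])  # embedding-like target, not a distribution

loss = criterion(logits, bad_target)
print(loss.item())  # -(-1.0 * log 0.5) = log 0.5 ≈ -0.6931, i.e. negative
```

With class indices (or a proper probability distribution) as the target, the same criterion stays non-negative.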

I need help resolving this issue. Thanks in advance.
