
The following code (extracted from SentEval) defines a neural network that maps 1024 real-valued inputs to 5 outputs. The task is to assess the relatedness between two sentences (each represented by 512 features), where relatedness is a real number in [1, 5]. I think that if the training relatedness scores were integers in {1, 2, 3, 4, 5}, cross-entropy would be the better loss function, but since the training scores are real numbers in [1, 5], MSE is used instead.

Question: Since the network outputs 5 probabilities for each input, how is the MSE calculated between a single real number and those 5 probabilities?

from torch import nn

inputdim = 1024
nclasses = 5
model = nn.Sequential(
    nn.Linear(inputdim, nclasses),
    nn.Softmax(dim=-1),
)
loss_fn = nn.MSELoss()
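
To make the shape mismatch in the question concrete, here is a minimal sketch of one way the two sides could be made comparable. It assumes the real-valued score is first spread over its two neighbouring integer classes, giving a 5-dimensional target distribution. Whether SentEval does exactly this should be checked against its source. The helper `encode_score` and the example score 3.6 are hypothetical, introduced only for illustration.

```python
import torch
from torch import nn


def encode_score(score: float, nclasses: int = 5) -> torch.Tensor:
    """Hypothetical helper: spread a score in [1, nclasses] over two adjacent classes."""
    target = torch.zeros(nclasses)
    low = int(score)              # floor, since scores are positive
    frac = score - low            # distance above the lower integer
    target[low - 1] = 1.0 - frac  # weight on the lower class
    if low < nclasses:
        target[low] = frac        # remaining weight on the upper class
    return target


torch.manual_seed(0)
inputdim, nclasses = 1024, 5
model = nn.Sequential(
    nn.Linear(inputdim, nclasses),
    nn.Softmax(dim=-1),
)
loss_fn = nn.MSELoss()

x = torch.randn(1, inputdim)             # one concatenated sentence pair (random stand-in)
target = encode_score(3.6).unsqueeze(0)  # approximately [0, 0, 0.4, 0.6, 0]
probs = model(x)                         # shape (1, 5), each row sums to 1

# MSE now compares two 5-dimensional vectors element by element.
loss = loss_fn(probs, target)

# A scalar prediction can be recovered as the expected class value.
classes = torch.arange(1, nclasses + 1, dtype=torch.float32)
predicted_score = (probs * classes).sum(dim=-1)
```

Under this assumption, the expected value of the encoded target gives back the original score (0.4 * 3 + 0.6 * 4 = 3.6), so the same weighted sum applied to the network's output yields a prediction in [1, 5].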
Hossein
