How does the MSE loss function work in pytorch?

Asked Dec 04 '18 at 13:49

Active Dec 04 '18 at 13:49

Viewed 2,445 times

In the following code (extracted from SentEval), a neural network is defined that maps 1024 real numbers to 5 output predictions. The task is to assess the relatedness between two sentences (each represented with 512 features). The relatedness is a number in [1,5]. I think that if the training relatedness scores were in {1,2,3,4,5}, cross entropy would be a better loss function, but since the training set has real-valued relatedness scores in [1,5], MSE is used as the loss function.

Question: Since the network outputs 5 probabilities for each input, how is the MSE calculated between a single real number and 5 probabilities?

from torch import nn

inputdim = 1024
nclasses = 5
model = nn.Sequential(
    nn.Linear(inputdim, nclasses),
    nn.Softmax(dim=-1),
)
loss_fn = nn.MSELoss()
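
For reference, nn.MSELoss (with its default reduction='mean') compares an input and a target element-wise and averages the squared differences, so a 5-element output would normally be paired with a 5-element target rather than with a single number. Below is a minimal sketch of one way that could work, assuming the real-valued score is spread over its two neighbouring integer classes. The helper encode_score, the example score 3.6, and the random input are hypothetical and only for illustration; I have not confirmed that the original code does exactly this.

```python
import math

import torch
from torch import nn


def encode_score(score, nclasses=5):
    """Hypothetical helper: spread a real score in [1, nclasses] over the
    two neighbouring integer classes, so the result sums to 1."""
    target = torch.zeros(nclasses)
    low = math.floor(score)          # lower neighbouring class (1-based)
    frac = score - low               # share that goes to the upper class
    target[low - 1] = 1.0 - frac
    if frac > 0:
        target[low] = frac
    return target


torch.manual_seed(0)

inputdim = 1024
nclasses = 5
model = nn.Sequential(
    nn.Linear(inputdim, nclasses),
    nn.Softmax(dim=-1),
)
loss_fn = nn.MSELoss()

x = torch.randn(1, inputdim)                 # one made-up input vector
target = encode_score(3.6).unsqueeze(0)      # roughly [0, 0, 0.4, 0.6, 0]

probs = model(x)                             # shape (1, 5), each row sums to 1
loss = loss_fn(probs, target)                # mean over all 5 squared differences

# The same value computed by hand, to show what MSELoss does here.
manual = ((probs - target) ** 2).mean()

# A single score can be read back as the expected class value under probs.
classes = torch.arange(1.0, nclasses + 1)
predicted_score = (probs * classes).sum(dim=-1)
```

Under this assumption the loss is taken between two 5-element distributions, and a single relatedness score is only recovered afterwards as the probability-weighted average of the class values 1 to 5.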

asked Dec 04 '18 at 13:49

Hossein


If it's a regression problem and not a classification problem, you should use MSE (mean squared error) as the loss function to infer real values between 1 and 5 in your case. – Mehdi Bahra Dec 04 '18 at 15:11
@MehdiBahra Yes, but why is a softmax layer with 5 neurons added? – Hossein Dec 04 '18 at 17:11
I don't know exactly; it's not common, but the probabilistic interpretation of MSE can help you understand that: http://rohanvarma.me/Loss-Functions/ – Mehdi Bahra Dec 04 '18 at 20:44
@MehdiBahra OK, thanks. – Hossein Dec 04 '18 at 20:46


0 Answers