Why does the result of torch.nn.functional.mse_loss(x1, x2)
differ from a direct computation of the MSE?
My test code to reproduce:
import torch
import numpy as np

# Think of x1 as predicted 2D coordinates and x2 as the ground truth
x1 = torch.rand(10, 2)
x2 = torch.rand(10, 2)

mse_torch = torch.nn.functional.mse_loss(x1, x2)
print(mse_torch)  # 0.1557

# Mean of squared per-point Euclidean distances
mse_direct = torch.nn.functional.pairwise_distance(x1, x2).square().mean()
print(mse_direct)  # 0.3314

# Same thing, computed point by point with numpy
mse_manual = 0
for i in range(len(x1)):
    mse_manual += np.square(np.linalg.norm(x1[i] - x2[i])) / len(x1)
print(mse_manual)  # 0.3314
As we can see, the result from torch's mse_loss is 0.1557, differing from the manual MSE computation, which yields 0.3314.
In fact, the direct result is as big as the mse_loss result times the dimension of the points (here 2).
What's up with that?
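For what it's worth, here is a small check I wrote to probe that suspicion (my own sketch, not from the docs; the seed is only there so the run is reproducible). It compares mse_loss against an elementwise mean of squared differences and against a mean of squared per-point norms, which would explain the factor if mse_loss with its default reduction='mean' divides by the total element count N*D rather than by the number of points N:

```python
import torch

torch.manual_seed(0)  # fixed seed so the comparison is reproducible
x1 = torch.rand(10, 2)
x2 = torch.rand(10, 2)

# torch's MSE with the default reduction='mean'
mse_torch = torch.nn.functional.mse_loss(x1, x2)

# Elementwise mean of squared differences: sum over all 10*2 = 20 entries, divided by 20
mse_elementwise = (x1 - x2).square().mean()

# Mean of squared per-point Euclidean norms: sum over the same 20 entries, divided by 10
mse_per_point = (x1 - x2).square().sum(dim=1).mean()

print(torch.allclose(mse_torch, mse_elementwise))              # True
print(torch.allclose(mse_torch * x1.shape[1], mse_per_point))  # True
```

On my machine the first comparison matches exactly and the second matches after multiplying by the dimension, which at least is consistent with mse_loss averaging over all elements instead of over points.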