I have two 1-dimensional PyTorch tensors (of type bfloat16), and I want to compute their inner/dot product. Should I use torch.dot or torch.inner? I thought it shouldn't really matter which one, but they give me wildly different results. I experimented with some other methods too, and found that some behave like torch.dot and some behave like torch.inner (see the comments in the code below). Why is this? And which is the right one to use?
import torch
torch.manual_seed(17) # set seed for replication
dtype=torch.bfloat16
# make two 1d tensors
a = torch.rand([10000], dtype=dtype)
b = torch.rand([10000], dtype=dtype)
x1 = torch.dot(a,b)
# or, equivalently:
# x1 = torch.matmul(a,b)
# x1 = a @ b
x2 = torch.inner(a,b)
# or equivalently:
# x2 = (a*b).sum(-1)
# x2 = torch.mul(a,b).sum(-1)
# x2 = torch.matmul(a.unsqueeze(0),
#                   b.unsqueeze(-1)).squeeze()
The results are not equal. They're not even close.
print(f"""{x1 = }\n{x2 = }
{torch.equal(x1, x2) = }
{torch.isclose(x1, x2) = }""")
x1 = tensor(256., dtype=torch.bfloat16)
x2 = tensor(2464., dtype=torch.bfloat16)
torch.equal(x1, x2) = False
torch.isclose(x1, x2) = tensor(False)
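The value 256 itself looks suspicious to me. In bfloat16 the gap between adjacent representable numbers at 256 is 2, so if torch.dot accumulates its running sum in bfloat16 (that's just my guess about the internals), adding products of numbers drawn from [0, 1) can never move the sum past 256. A quick check of that rounding behaviour:

s = torch.tensor(256., dtype=torch.bfloat16)
p = torch.tensor(0.9, dtype=torch.bfloat16)
print(s + p)  # tensor(256., dtype=torch.bfloat16) -- the addend is rounded away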
However, if I set dtype=torch.float instead of bfloat16, the two results end up nearly the same (with only a tiny difference that I assume comes from floating-point rounding and the order of accumulation).
x1 = tensor(2477.7292)
x2 = tensor(2477.7295)
torch.equal(x1, x2) = False
torch.isclose(x1, x2) = tensor(True)
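Going back to the bfloat16 versions of a and b, I figure a fair reference is to upcast those exact (already rounded) tensors to float32, accumulate there, and see which of the two results lands closer to it:

ref = torch.dot(a.float(), b.float())  # float32 accumulation of the same bfloat16 inputs
print(f"{ref = }")
print(f"{(x1.float() - ref).abs() = }")
print(f"{(x2.float() - ref).abs() = }")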
What is the best way to compute the inner product reliably when the dtype is (or may be) bfloat16?
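The only workaround I've come up with is to upcast to float32 just for the reduction and cast the result back, roughly like this (bf16_dot is only a hypothetical helper name, and I'm assuming a float32 round trip is acceptable for my use case):

def bf16_dot(a, b):
    # do the accumulation in float32, then round the
    # scalar result back to the input dtype
    return torch.dot(a.float(), b.float()).to(a.dtype)

But I don't know whether that is the intended approach, or whether PyTorch has a built-in way to get a stable bfloat16 dot product.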
EDIT: Python 3.10.10, tensors on CPU.