Consider the following matrix product between two arrays:
import numpy as np
A = np.random.rand(2, 10, 10)
B = np.random.rand(2, 2)
C = A.T @ B
...goes fine. I think of the above as a 1-by-2 times 2-by-2 vector-matrix product, broadcast over the 10-by-10 second and third dimensions of A. Inspecting the result C confirms this intuition: np.allclose(C[i, j], A.T[i, j] @ B) holds for all i, j.
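For reference, here is the exact check I ran (a full loop over both indices, just to be thorough):

```python
import numpy as np

A = np.random.rand(2, 10, 10)
B = np.random.rand(2, 2)
C = A.T @ B  # A.T has shape (10, 10, 2), so C has shape (10, 10, 2)

# Every C[i, j] is the row vector A.T[i, j] (length 2) times the 2-by-2 B
assert all(
    np.allclose(C[i, j], A.T[i, j] @ B)
    for i in range(10) for j in range(10)
)
```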
Now, mathematically, I should be able to compute C.T as B.T @ A, but:
B.T @ A
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-32-ffdbb14ca160> in <module>
----> 1 B.T @ A
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 10 is different from 2)
So broadcast-wise, a 10-by-10-by-2 tensor and a 2-by-2 matrix are compatible with respect to the matrix product, but a 2-by-2 matrix and a 2-by-10-by-10 tensor are not?
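My reading of the error message is that matmul treats only the last two axes as "core" matrix dimensions and broadcasts over any leading axes, which would explain the asymmetry. A minimal shape check consistent with that reading:

```python
import numpy as np

A = np.random.rand(2, 10, 10)
B = np.random.rand(2, 2)

# A.T has shape (10, 10, 2): matmul sees a stack of ten (10, 2) matrices,
# so each (10, 2) @ (2, 2) product succeeds, stacked over the leading axis.
print((A.T @ B).shape)  # (10, 10, 2)

# A has shape (2, 10, 10): matmul sees a stack of two (10, 10) matrices,
# and the core product (2, 2) @ (10, 10) has mismatched inner dimensions.
try:
    B.T @ A
except ValueError as err:
    print(err)
```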
Bonus info: I want to be able to compute the "quadratic product" A.T @ B @ A, and it really annoys me to have to write for-loops to manually "broadcast" over one of the dimensions. It feels like it should be possible to do this more elegantly. I am pretty experienced with Python and NumPy, but I rarely go beyond two-dimensional arrays.
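Concretely, this is the for-loop version I am trying to avoid, together with an np.einsum one-liner that I suspect is equivalent (assumption on my part: that "A.T @ B @ A" should mean the scalar a.T B a for each column vector a = A[:, i, j]):

```python
import numpy as np

A = np.random.rand(2, 10, 10)
B = np.random.rand(2, 2)

# For-loop version: for every (i, j), the scalar A[:, i, j] @ B @ A[:, i, j]
Q = np.empty((10, 10))
for i in range(10):
    for j in range(10):
        a = A[:, i, j]
        Q[i, j] = a @ B @ a

# Candidate one-liner: contract over both length-2 axes, keep (i, j)
Q2 = np.einsum('kij,kl,lij->ij', A, B, A)
assert np.allclose(Q, Q2)
```

Is something like this einsum call the idiomatic way, or can matmul itself be made to do it?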
What am I missing here? Is there something about the way transpose operates on tensors in NumPy that I do not understand?