
Let's assume that we have two tensors (A and B) with the same number of dimensions. We can multiply them with tensordot. For example:

T.tensordot(A, B, axes = [[0,3], [0,3]])

In this case we "pair" axes of the first tensor with some axes of the second tensor and then we sum over these "paired" axes:

C[j, k, a, b] = sum_{i, l} A[i, j, k, l] * B[i, a, b, l]

In the above example the first and the last axes of the first tensor are paired with the first and the last axes of the second tensor, respectively.
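For readers who want to check the index bookkeeping, here is a minimal sketch that uses NumPy as a stand-in for Theano (an assumption; `np.tensordot` takes the same `axes` argument). The shapes and the `einsum` cross-check are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3, 4, 5))   # axes: i, j, k, l
B = rng.standard_normal((2, 6, 7, 5))   # axes: i, a, b, l

# Pair axis 0 of A with axis 0 of B and axis 3 of A with axis 3 of B,
# then sum over both paired axes.
C = np.tensordot(A, B, axes=[[0, 3], [0, 3]])

# The same contraction written in index notation.
C_ref = np.einsum('ijkl,iabl->jkab', A, B)
```

The unpaired axes of A come first in the result, followed by the unpaired axes of B, so `C` has axes (j, k, a, b).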

Alternatively, we can multiply the two tensors element-wise:

C[i, j, k, l] = A[i, j, k, l] * B[i, j, k, l]

In this case we "pair" all the axes of the first tensor with the corresponding axes of the second tensor (first with first, second with second and so on), and there is no summation.

Now, I want to do something that is in between the two operations described above. In more detail:

  1. I want to pair only some axes of the first tensor with some axes of the second tensor (like we do in tensordot). So, I do not want to pair all the axes of A with all the axes of B.
  2. I do not want to sum over all the paired axes. Some of them should be kept, as in pairwise multiplication, where there is no summation over the paired axes.

Here is what I want written in a "mathematical" form:

C[a, b, c, i] = sum_d A[a, b, c, d] * B[i, b, c, d]
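To spell out exactly which axes are paired and which are summed, here is a slow explicit-loop sketch in NumPy (not Theano). The function name `paired_product_reference` is hypothetical and exists only for this illustration:

```python
import numpy as np

def paired_product_reference(A, B):
    """Explicit-loop version of C[a, b, c, i] = sum_d A[a, b, c, d] * B[i, b, c, d]."""
    n_a, n_b, n_c, n_d = A.shape
    n_i = B.shape[0]
    # Axes b, c and d are paired, so their sizes must match.
    assert B.shape[1:] == (n_b, n_c, n_d)
    C = np.zeros((n_a, n_b, n_c, n_i))
    for a in range(n_a):
        for b in range(n_b):
            for c in range(n_c):
                for i in range(n_i):
                    # b and c are paired but kept; d is paired and summed.
                    C[a, b, c, i] = np.dot(A[a, b, c, :], B[i, b, c, :])
    return C
```

Here b and c behave as in pairwise multiplication, d behaves as in tensordot, and a and i are not paired at all.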

What is the best way to do it in theano?

Benjamin W.
Roman

1 Answer


One way to approach this problem is to use pairwise (element-wise) multiplication, `*`. Pairwise multiplication "pairs" all the axes of the first tensor with the corresponding axes of the second tensor (first with first, second with second, ..., last with last). Therefore we need to solve two problems: (1) shuffle the axes of the two tensors so that the proper axes are paired with each other, and (2) add "dummy" (broadcastable) axes to prevent pairing where it is not needed. Finally, we sum over the axes we want.

The particular problem mentioned in the question

C[a, b, c, i] = sum_d A[a, b, c, d] * B[i, b, c, d]

is solved in the following way:

T.sum(A.dimshuffle(0, 1, 2, 3, 'x') * B.dimshuffle('x', 1, 2, 3, 0), axis=3)
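The shapes involved can be checked with a small NumPy sketch of the same idea (NumPy rather than Theano is an assumption made for easy testing; `np.newaxis` plays the role of `'x'`, and the array sizes are made up). After broadcasting, the product has axes (a, b, c, d, i), so the shared `d` axis sits at position 3:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3, 4, 5))   # axes: a, b, c, d
B = rng.standard_normal((6, 3, 4, 5))   # axes: i, b, c, d

# Append a dummy trailing axis to A: (a, b, c, d, 1).
A5 = A[:, :, :, :, np.newaxis]
# Move i to the end of B and prepend a dummy axis: (1, b, c, d, i).
B5 = np.transpose(B, (1, 2, 3, 0))[np.newaxis]

# The broadcast product has axes (a, b, c, d, i); summing over d leaves (a, b, c, i).
C = (A5 * B5).sum(axis=3)

# Cross-check against the formula written in index notation.
C_ref = np.einsum('abcd,ibcd->abci', A, B)
```

Note that this approach materializes the full five-dimensional product before summing, so memory use grows with the product of all five axis sizes.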
Roman