
Say I have two tensors in TensorFlow, with the first dimension representing the index of a training example in a batch, and the others representing vectors or matrices of data. E.g.

vector_batch = tf.ones([64, 50])
matrix_batch = tf.ones([64, 50, 50])

I'm curious what the most idiomatic way is to perform a vector * matrix multiply for each vector/matrix pair that shares an index along the first dimension.

In other words, the most idiomatic way to write the equivalent of:

# Pseudocode: TensorFlow has no tf.empty, and tensors don't support item assignment.
result = tf.empty([64, 50])
for i in range(64):
    result[i, :] = tf.matmul(vector_batch[i, :], matrix_batch[i, :, :])

What would be the best way to organize the shape of the input vectors to make this process as simple/clean as possible?

Taaam

2 Answers


Probably the most idiomatic way to do this is using the tf.batch_matmul() operator (in conjunction with tf.expand_dims() and tf.squeeze()):

vector_batch = tf.placeholder(tf.float32, shape=[64, 50])
matrix_batch = tf.placeholder(tf.float32, shape=[64, 50, 50])

vector_batch_as_matrices = tf.expand_dims(vector_batch, 1)
# vector_batch_as_matrices.get_shape() ==> [64, 1, 50]

result = tf.batch_matmul(vector_batch_as_matrices, matrix_batch)
# result.get_shape() ==> [64, 1, 50]

result = tf.squeeze(result, [1])
# result.get_shape() ==> [64, 50]
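The expand/batch-matmul/squeeze recipe above is easy to sanity-check outside TensorFlow, since np.matmul applies the same batching rule over leading dimensions. A minimal NumPy sketch verifying it against an explicit per-example loop:

```python
import numpy as np

rng = np.random.default_rng(0)
vector_batch = rng.standard_normal((64, 50)).astype(np.float32)
matrix_batch = rng.standard_normal((64, 50, 50)).astype(np.float32)

# Reference: one vector * matrix product per batch index.
expected = np.stack([vector_batch[i] @ matrix_batch[i] for i in range(64)])

# The answer's recipe: [64, 50] -> [64, 1, 50], batched matmul, drop the 1.
as_matrices = np.expand_dims(vector_batch, 1)   # shape [64, 1, 50]
result = np.matmul(as_matrices, matrix_batch)   # shape [64, 1, 50]
result = np.squeeze(result, axis=1)             # shape [64, 50]

assert result.shape == (64, 50)
assert np.allclose(result, expected, atol=1e-4)
```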
qwerty
mrry

It appears that tf.matmul already supports this type of "nd" operation: https://stackoverflow.com/a/43819275/2930156
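The same batched contraction can also be spelled out directly with einsum, which many frameworks (TensorFlow, NumPy, PyTorch) support with identical subscript notation. A NumPy sketch of the "bj,bjk->bk" pattern, checked against the explicit loop:

```python
import numpy as np

rng = np.random.default_rng(1)
v = rng.standard_normal((64, 50))
m = rng.standard_normal((64, 50, 50))

# For each batch index b, contract the vector's j axis against the
# matrix's first data axis, leaving a length-k result per example.
result = np.einsum("bj,bjk->bk", v, m)

expected = np.stack([v[i] @ m[i] for i in range(64)])
assert np.allclose(result, expected)
```

This avoids the expand_dims/squeeze bookkeeping entirely, at the cost of learning the subscript notation.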

John Jiang