
Consider the following example of numpy broadcasting:

import numpy as np
import theano
from theano import tensor as T

xval = np.array([[1, 2, 3], [4, 5, 6]])
bval = np.array([[10, 20, 30]])
print(xval + bval)

As expected, the vector bval is added to each row of the matrix xval, and the output is:

[[11 22 33]
 [14 25 36]]
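(For reference, NumPy works this out from the run-time shapes alone: any axis of length 1 is stretched to match the other operand.)

```python
import numpy as np

a = np.ones((2, 3))
b = np.ones((1, 3))
# NumPy compares the shapes at call time; the length-1 first
# axis of b is stretched to match a's first axis.
print((a + b).shape)  # (2, 3)
```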

Trying to replicate the same behaviour in the git version of theano:

x = T.dmatrix('x')
b = theano.shared(bval)
z = x + b
f = theano.function([x], z)

print(f(xval))

I get the following error:

ValueError: Input dimension mis-match. (input[0].shape[0] = 2, input[1].shape[0] = 1)
Apply node that caused the error: Elemwise{add,no_inplace}(x, <TensorType(int64, matrix)>)
Inputs types: [TensorType(float64, matrix), TensorType(int64, matrix)]
Inputs shapes: [(2, 3), (1, 3)]
Inputs strides: [(24, 8), (24, 8)]
Inputs scalar values: ['not scalar', 'not scalar']

I understand that Tensor objects such as x have a broadcastable attribute, but I can't find a way to either (1) set it correctly for the shared variable or (2) have it inferred correctly. How can I reproduce NumPy's behaviour in Theano?

mbatchkarov

1 Answer


Theano needs all broadcastable dimensions to be declared in the graph before compilation, whereas NumPy uses the run-time shape information.

By default, no dimension of a shared variable is broadcastable, since its shape can change.

To create the shared variable with the broadcastable dimension you need in your example:

b = theano.shared(bval, broadcastable=(True, False))
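As a plain-NumPy sketch of the same idea (no Theano needed to run it), the pattern (True, False) simply records which axes of bval have length 1, i.e. which axes NumPy would stretch at run time:

```python
import numpy as np

bval = np.array([[10, 20, 30]])  # shape (1, 3)

# A dimension is broadcastable exactly when its length is 1,
# so the pattern matching bval's shape is (True, False).
pattern = tuple(s == 1 for s in bval.shape)
print(pattern)  # (True, False)
```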

I'll add this information to the documentation.

nouiz
  • It is there: http://deeplearning.net/software/theano/library/compile/shared.html#theano.compile.sharedvalue.shared Where did you look for it? Maybe we need to add this in another place too. – nouiz Jan 18 '15 at 16:05
  • ah sorry, it's there in the kwargs... didn't see it. Anyway, I think it's an important feature and would maybe warrant some more visibility? – H. Arponen Jan 18 '15 at 16:58
  • 1
    I added at the end of this section a note (not merge yet, but in a PR) http://deeplearning.net/software/theano/tutorial/examples.html#using-shared-variables – nouiz Jan 19 '15 at 05:44
  • Are there some restrictions here? For most, but not all broadcasting patterns I get "TypeError: No suitable SharedVariable constructor could be found. Are you sure all kwargs are supported? We do not support the parameter dtype or type.". I didn't realise there were restrictions on valid broadcasting patterns. – zenna Feb 04 '16 at 21:57
  • How to set the broadcastable attribute for subtensors that result from selecting indices, and that are not shared? Thanks! See my question here: http://stackoverflow.com/questions/38309429/broadcasting-for-subtensor-created-from-matrix-theano – benroth Jul 13 '16 at 08:29