
In the Theano deep learning tutorial, y is a shared variable that is cast:

    y = theano.shared(numpy.asarray(data, dtype=theano.config.floatX))
    y = theano.tensor.cast(y, 'int32')

I later want to set a new value for y.

For GPU this works:

    y.owner.inputs[0].owner.inputs[0].set_value(np.asarray(data2, dtype=theano.config.floatX))

For CPU this works:

    y.owner.inputs[0].set_value(np.asarray(data2, dtype=theano.config.floatX))

Why does this require a different syntax between GPU and CPU? I would like my code to work for both cases, am I doing it wrong?

o17t H1H' S'k

1 Answer


This is a very similar problem to that described in another StackOverflow question.

The problem is that you are applying a symbolic cast operation, which wraps the shared variable in a new symbolic variable; the original shared variable then sits somewhere inside the resulting graph, which is why you have to dig through `owner.inputs` to reach it.

The solution is to cast the shared variable's value rather than the shared variable itself.

Instead of

    y = theano.shared(numpy.asarray(data, dtype=theano.config.floatX))
    y = theano.tensor.cast(y, 'int32')

Use

    y = theano.shared(numpy.asarray(data, dtype='int32'))

Navigating the Theano computational graph via the owner attribute is considered bad form. If you want to alter the shared variable's value, maintain a Python reference to the shared variable and set its value directly.

So, with y being a plain shared variable rather than a symbolic variable, you can now simply do:

    y.set_value(np.asarray(data2, dtype='int32'))

Note that the cast now happens eagerly in numpy, rather than symbolically in Theano.
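As a minimal numpy-only sketch of that last point (using a hypothetical `data2` list standing in for your label data):

```python
import numpy as np

# Hypothetical label data; np.asarray with dtype='int32' performs the
# cast eagerly, so the array you would hand to theano.shared (or to
# set_value) is already integer-typed -- no symbolic cast node is needed.
data2 = [0.0, 1.0, 2.0]
labels = np.asarray(data2, dtype='int32')

print(labels.dtype)    # int32
print(labels.tolist()) # [0, 1, 2]
```

Because the array is already `int32`, `theano.shared(labels)` would produce a shared variable whose `set_value` can be called directly, with no graph navigation.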

Daniel Renshaw
    The weird thing in the OP is that there is a difference between CPU and GPU - GPU seems to add an extra node, making it necessary to go back two steps instead of one. Do you have any idea where this may come from? – eickenberg Jun 25 '15 at 12:16
  • 2
    It's probably a `HostFromGpu` or `GpuFromHost` operation. If the graph is printed via `theano.printing.debugprint` then you'll be able to see what the extra operation is. This is part of why it's best to avoid accessing inputs/outputs/owner when not required. – Daniel Renshaw Jun 25 '15 at 13:27
  • so the tutorial does the casting that way because, as its comments explain: "When storing data on the GPU it has to be stored as floats, therefore we will store the labels as ``floatX`` as well (``shared_y`` does exactly that). But during our computations we need them as ints (we use labels as index, and if they are floats it doesn't make sense), therefore instead of returning ``shared_y`` we will have to cast it to int. This little hack lets us get around this issue." – o17t H1H' S'k Jun 29 '15 at 20:52