
I have a strange error that I can't make sense of when compiling a scan operator in Theano. When outputs_info is initialized with a last dimension equal to one, I get this error:

TypeError: ('The following error happened while compiling the node',
forall_inplace,cpu,scan_fn}(TensorConstant{4}, IncSubtensor{InplaceSet;:int64:}.0, <TensorType(float32, vector)>),
'\n', "Inconsistency in the inner graph of scan 'scan_fn' : an input and an output are
associated with the same recurrent state and should have the same type but have type
'TensorType(float32, (True,))' and 'TensorType(float32, vector)' respectively.")

With this dimension set to anything greater than one, I don't get any error.

This error happens on both GPU and CPU targets, with Theano 0.7, 0.8.0, and 0.8.2.

Here is a piece of code to reproduce the error:

import theano
import theano.tensor as T
import numpy as np

def rec_fun( prev_output, bias):        
    return prev_output + bias

n_steps = 4

# with state_size>1, compilation runs smoothly
state_size = 2

bias = theano.shared(np.ones((state_size),dtype=theano.config.floatX))
(outputs, updates) = theano.scan( fn=rec_fun,
                              sequences=[],
                              outputs_info=T.zeros([state_size,]),
                              non_sequences=[bias],
                              n_steps=n_steps
                              )
print outputs.eval()

# with state_size==1, compilation fails
state_size = 1

bias = theano.shared(np.ones((state_size),dtype=theano.config.floatX))
(outputs, updates) = theano.scan( fn=rec_fun,
                              sequences=[],
                              outputs_info=T.zeros([state_size,]),
                              non_sequences=[bias],
                              n_steps=n_steps
                              )
# compilation fails here
print outputs.eval()

Compilation thus behaves differently depending on state_size. Is there a workaround that handles both cases, state_size == 1 and state_size > 1?

1 Answer


Changing

outputs_info=T.zeros([state_size,])

to

outputs_info=T.zeros_like(bias)

makes it work properly for the case of state_size == 1.
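For reference, this is roughly what the second (failing) block from the question looks like with only that one change applied:

bias = theano.shared(np.ones((state_size),dtype=theano.config.floatX))
(outputs, updates) = theano.scan( fn=rec_fun,
                              sequences=[],
                              outputs_info=T.zeros_like(bias),  # same shape and broadcastable pattern as bias
                              non_sequences=[bias],
                              n_steps=n_steps
                              )
print outputs.eval()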

A brief explanation and an alternative solution

I noticed a crucial difference between the two cases. Add these lines of code right after the bias declaration line in both cases:

bias = ....
print bias.broadcastable
print T.zeros([state_size,]).broadcastable

The results are as follows.

For the first case, where your code works:

(False,)
(False,)

And for the second case, where it breaks down:

(False,)
(True,)

So what happens is that when you add two tensors of the same dimensions (bias and the T.zeros initial state) but with different broadcastable patterns, the result inherits the pattern of bias, (False,). The output of each scan step therefore has a different broadcastable pattern than the initial state you passed through outputs_info, which is why Theano complains that the recurrent input and output are not of the same type.
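You can check this directly; a minimal sketch of the check, assuming state_size = 1 and bias defined as in the question:

step_output = T.zeros([state_size,]) + bias  # what rec_fun computes at each step
print step_output.broadcastable              # (False,): the non-broadcastable pattern of bias wins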

T.zeros_like works because it uses the bias variable to generate the zeros tensor, so the result has the same broadcastable pattern as bias, (False,).

Another way to fix the problem is to change the broadcastable pattern explicitly, like so:

outputs_info=T.patternbroadcast(T.zeros([state_size,]), (False,)),
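Plugged into the scan call from the question, that would look roughly like this (same variable names, nothing else changed):

(outputs, updates) = theano.scan( fn=rec_fun,
                              sequences=[],
                              outputs_info=T.patternbroadcast(T.zeros([state_size,]), (False,)),  # force a non-broadcastable first dimension
                              non_sequences=[bias],
                              n_steps=n_steps
                              )
print outputs.eval()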
Makis Tsantekidis
  • Thank you! This seems to be a good workaround, even if in my particular case, where outputs_info in fact depends on several shape parameters, it makes the code less readable. I'd like an explanation of why the straightforward outputs_info=T.zeros([1,]) solution fails... – Romain Hennequin May 27 '16 at 14:00
  • I edited the answer so it contains an explanation and another solution to help you understand the problem. – Makis Tsantekidis May 27 '16 at 17:03
  • Thank you so much for your help! It is much clearer to me now. Setting the broadcast pattern explicitly makes the code much more readable. Thanks! – Romain Hennequin May 30 '16 at 07:51
  • If I answered your question, please accept the answer :D – Makis Tsantekidis May 30 '16 at 08:48