In Theano, the repeat function T.repeat(A, B) can take a pair of vectors, so that each element A[i] is repeated B[i] times.
Unfortunately, this operation has no defined gradient (it throws a not-implemented exception), which is a problem because I'm trying to use it with PyMC3's gradient-based samplers.
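To make the intended behaviour concrete, here is a minimal sketch of the element-wise repeat described above, written with NumPy's np.repeat. It assumes T.repeat follows the same semantics for vector repeats; the values in A and B are made up for illustration.

```python
import numpy as np

# Illustrative inputs (not from my real model): A holds the values,
# B holds how many times each corresponding value should be repeated.
A = np.array([0.5, 2.0])
B = np.array([3, 1])

# A[0] is repeated B[0] times, then A[1] is repeated B[1] times.
repeated = np.repeat(A, B)
print(repeated)
```

The output length is sum(B), which is why the result cannot be expressed as a simple fixed-shape broadcast.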
I think I can address this by using the scan function and calling repeat for each pair of elements from the two vectors. However, my code isn't working, probably because I'm calling scan incorrectly. Can anyone help me understand why the code below isn't working?
import numpy as np
import theano as th
import theano.tensor as T

A = T.dvector('A')
B = T.ivector('B')
# Test values so Theano can check shapes while the graph is built
A.tag.test_value = np.array(np.random.rand(2), dtype="float32")
B.tag.test_value = np.array(np.random.rand(2), dtype="int32")
th.config.compute_test_value = 'warn'
# Attempt: repeat each element of A by the matching element of B
results, updates = th.scan(fn=lambda prior_result, A, B: A.repeat(B),
                           sequences=[A, B],
                           outputs_info=T.constant([1, 4, 4, 4]))
b = th.function(inputs=[A, B], outputs=results.flatten())
b([1], [4])
I'd expect this to return [1, 1, 1, 1], but instead it raises the error below.
395 except AttributeError:
396 return _wrapit(a, 'repeat', repeats, axis)
--> 397 return repeat(repeats, axis)
398
399
ValueError: operands could not be broadcast together with shape (1,) (4,)
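As far as I can tell, the same kind of failure can be reproduced in plain NumPy when a single value is paired with a repeats vector of length 4, which suggests the shapes reaching repeat are not the ones I intended. This is only a sketch for narrowing things down, not a diagnosis of the scan call; try_repeat is a hypothetical helper I wrote for this illustration.

```python
import numpy as np

def try_repeat(values, repeats):
    """Hypothetical helper: return the repeated array, or the ValueError raised."""
    try:
        return np.repeat(values, repeats)
    except ValueError as exc:
        return exc

# One value with a scalar repeat count works: the count applies to the value.
ok = try_repeat(np.array([1.0]), 4)

# One value with four repeat counts fails: NumPy needs one count per element
# (or a single count), so lengths 1 and 4 cannot be matched.
bad = try_repeat(np.array([1.0]), [1, 4, 4, 4])

print(ok)
print(type(bad).__name__)
```

The shapes (1,) and (4,) in the traceback would then correspond to one input element and a four-element repeats argument.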
I've raised an issue on the PyMC3 GitHub to see whether this is something that should be fixed more permanently, but I figure it's a good opportunity for me to learn more about Theano anyway, and if I can resolve the problem, maybe I can contribute back to the project.