In Theano, the repeat function T.repeat(A, B) accepts a pair of vectors, such that each element A[i] is repeated B[i] times.
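
For reference, NumPy's np.repeat has the same per-element behaviour, so the intended result can be sketched without Theano. The values below are made up for illustration only:

```python
import numpy as np

# Values to repeat, and how many times each one should appear.
A = np.array([1.0, 2.0, 3.0])
B = np.array([2, 0, 3])

# Element A[i] appears B[i] times in the output; a count of 0 drops it.
repeated = np.repeat(A, B)
print(repeated)  # [1. 1. 3. 3. 3.]
```

The output length is B.sum(), so it depends on the values in B and not only on the shapes of the inputs.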

Unfortunately, this operation has no defined gradient (it raises a NotImplementedError), which is a problem because I'm trying to use it with PyMC3's gradient-based samplers.

I think I can work around this by using the scan function to call repeat once for each pair of elements from the two vectors. However, my code isn't working, probably because I'm calling scan incorrectly. Can anyone help me understand why the code below fails?

import numpy as np
import theano as th
import theano.tensor as T

A = T.dvector('A')
B = T.ivector('B')
A.tag.test_value = np.array(np.random.rand(2), dtype = "float32")
B.tag.test_value = np.array(np.random.rand(2), dtype = "int32")
th.config.compute_test_value = 'warn'

results, updates = th.scan(fn = lambda prior_result, A, B: A.repeat(B),
                          sequences = [A, B],
                          outputs_info = T.constant([1,4,4,4]))

b = th.function(inputs=[A,B], outputs=results.flatten())
b([1],[4])

I'd expect this to return [1,1,1,1], but instead it raises the error below.

    395     except AttributeError:
    396         return _wrapit(a, 'repeat', repeats, axis)
--> 397     return repeat(repeats, axis)
    398 
    399 

ValueError: operands could not be broadcast together with shape (1,) (4,)

I've raised an issue on the PyMC3 GitHub to see if this is something that should be fixed more permanently, but I figure it's a good opportunity for me to learn more about Theano anyway, and if I can resolve the problem, maybe I can contribute back to the project.

analystic
1 Answer

I see two problems here:

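  1. Wrong argument order in the lambda: scan passes the sequence elements first and the previous result after them, so the signature should be A, B, prior_result (as written, the name B receives the outputs_info value).
  2. The shape of A.repeat(B) differs from the shape of prior_result (at this stage of compilation).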

Quick fix: remove outputs_info from scan's arguments (and prior_result from the lambda), and you will get [1,1,1,1].
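
The argument mix-up can be mimicked outside Theano. The sketch below is plain NumPy, not scan itself: step and fixed_step are hypothetical stand-ins, called by hand in the order scan uses (sequence elements first, then the previous result), with a single pair of values taken from the question's call.

```python
import numpy as np

def step(prior_result, A, B):
    # Same parameter order and body as the lambda in the question.
    return np.repeat(A, B)

def fixed_step(A, B):
    # Quick fix: no prior_result, parameters in sequence order.
    return np.repeat(A, B)

a_i, b_i = 1.0, 4                 # one element from each sequence
initial = np.array([1, 4, 4, 4])  # the outputs_info value

# Mimic scan's call order: (A[i], B[i], previous result).
# Inside step, the name A now holds b_i and the name B holds the
# 4-element initial value, so the repeat counts cannot be broadcast.
try:
    step(a_i, b_i, initial)
    error = None
except ValueError as exc:
    error = exc
print(type(error).__name__)

# With the corrected signature, the single value is repeated 4 times.
fixed = fixed_step(a_i, b_i)
print(fixed)  # [1. 1. 1. 1.]
```

This only illustrates how the names are bound. It says nothing about whether the gradient is defined for the per-step repeat inside scan, which would still need to be checked in Theano.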