I am trying to write a deep Q-learning network for a problem in AI. I have a function predict() that takes an input of shape (None, 5) and produces a tensor of shape (None, 3). The 3 in (None, 3) corresponds to the q-value of each action that can be taken in each state. In the training step, I have to call predict() multiple times and use the result to compute the cost and train the model. I also have another data array available, called current_actions, which is a list containing the indices of the actions taken in each state during previous iterations.

What needs to happen is that current_states_outputs should be a tensor built from the output of predict() in which each row contains only one q-value (as opposed to the three from the output of predict()), and which q-value is selected should depend on the corresponding index in current_actions.
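To make the setup concrete, this is roughly the kind of graph I mean for predict() (the hidden layer size here is only a placeholder, not my actual network):

import tensorflow as tf

# Rough sketch of the shapes involved; the hidden size is arbitrary
X = tf.placeholder(tf.float32, shape=[None, 5])          # state input (self.X in my code)
hidden = tf.layers.dense(X, 32, activation=tf.nn.relu)
prediction = tf.layers.dense(hidden, 3)                  # one q-value per action (self.prediction in my code)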
For example, if the output of predict() is [[1,2,3],[4,5,6],[7,8,9]] and current_actions = [0,2,1], then current_states_outputs after the operation should be [1,6,8].
How do I do this?
I have tried the following:
# run predict() to get concrete q-values, then pick one q-value per row in NumPy
current_states_outputs = self.sess.run(self.prediction, feed_dict={self.X: current_states})
current_states_outputs = np.array([current_states_outputs[a][current_actions[a]] for a in range(len(current_actions))])
I basically ran the session on predict() and did the selection using normal Python methods. But because this severs the connection between the cost and the previous layers of the graph, no training can be done. So I need to do this operation while staying within TensorFlow, keeping everything as TensorFlow tensors. How can I manage this?