
I want to model my environment such that each action is made of 3 possible sub-actions.

I've defined the _action_spec of my tf_agents.environments.py_environment.PyEnvironment as:

self._action_spec = tf_agents.specs.BoundedArraySpec(
            shape=(3,), dtype=np.int32, name="action", minimum=[0, 0, 0], maximum=[10, 11, 12])

The call to the step method fails. This is what I'm trying:

env = NetworkEnv(discount=0.9)
tf_env = TFPyEnvironment(env)
print(tf_env.reset())
action = tf.constant([3, 3, 3], dtype=tf.int32, shape=(3,), name='action')
print(tf_env.step(action))
tf_env.close()

But it raises ValueError: cannot select an axis to squeeze out which has size not equal to one

How am I supposed to feed the action to the step method?

Lostefra
  • It comes to mind that you cannot set different minimum and maximum values for each sub-action, at least not this way. I'm not sure what is going on, but can you please share your environment class? What does the _step() function inside it look like? – HWerneck Jul 27 '21 at 15:06
  • I say this because I had a similar thing with different shapes of sub-observations, and there is a way of doing that. Plus, [this link](https://www.tensorflow.org/agents/tutorials/3_policies_tutorial) shows different examples of multiple actions, like `action_spec = array_spec.BoundedArraySpec((2,), np.int32, -10, 10)` which consists of two actions. None of the examples have the minimum and maximum presented per sub-action as you implemented. – HWerneck Jul 27 '21 at 15:12

1 Answer


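TFPyEnvironment expects actions with a leading batch dimension, which it removes with squeeze() before passing the action on to the Python environment. Your action has shape (3,), so there is no axis of size one to squeeze out. Add a batch axis so that the action has shape (1, 3):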

action = tf.constant([[3, 3, 3]], dtype=tf.int32, shape=(1,3), name='action')
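
To see why the extra axis matters, here is a minimal NumPy-only sketch. It assumes the wrapper drops the batch axis with a NumPy-style squeeze on axis 0; the `unbatch` helper is hypothetical and only stands in for that step, it is not the TF-Agents implementation.

```python
import numpy as np


def unbatch(action):
    """Hypothetical stand-in for the wrapper step that drops the batch axis."""
    # np.squeeze with an explicit axis only succeeds if that axis has size 1.
    return np.squeeze(np.asarray(action), axis=0)


unbatched_action = np.array([3, 3, 3], dtype=np.int32)   # shape (3,)
batched_action = np.array([[3, 3, 3]], dtype=np.int32)   # shape (1, 3)

# Axis 0 of the unbatched action has size 3, so squeezing it fails.
try:
    unbatch(unbatched_action)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Axis 0 of the batched action has size 1, so it is removed cleanly.
result = unbatch(batched_action)

print("unbatched action error:", error_message)
print("batched action after squeeze:", result, result.shape)
```

The same reasoning applies to the tf.constant above: with shape (1, 3) the leading axis has size one, so the environment receives an action of shape (3,) that matches the action spec.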