I want to model my environment such that each action is made of 3 possible sub-actions.
I've defined the _action_spec of my tf_agents.environments.py_environment.PyEnvironment as:
self._action_spec = tf_agents.specs.BoundedArraySpec(
shape=(3,), dtype=np.int32, name="action", minimum=[0, 0, 0], maximum=[10, 11, 12])
The step method fails when I try the following:
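import tensorflow as tf
from tf_agents.environments.tf_py_environment import TFPyEnvironment

# NetworkEnv is my PyEnvironment subclass with the action spec above
env = NetworkEnv(discount=0.9)
tf_env = TFPyEnvironment(env)
print(tf_env.reset())
action = tf.constant([3, 3, 3], dtype=tf.int32, shape=(3,), name='action')
print(tf_env.step(action))
tf_env.close()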
But it raises ValueError: cannot select an axis to squeeze out which has size not equal to one
How am I supposed to pass the action to the step method?
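For reference, the error message can be reproduced with NumPy alone. This is only a sketch. It assumes that TFPyEnvironment expects a leading batch dimension of size 1 on the action and squeezes it out before calling the wrapped environment's step. The unbatch helper below is a hypothetical stand-in for that step, not a tf_agents function.

```python
import numpy as np


def unbatch(action):
    # Hypothetical stand-in: drop a leading batch axis that must have size 1.
    return np.squeeze(action, axis=0)


unbatched_action = np.array([3, 3, 3], dtype=np.int32)  # shape (3,), as in my call
batched_action = np.array([[3, 3, 3]], dtype=np.int32)  # shape (1, 3), batch of one

try:
    unbatch(unbatched_action)
    error_message = None
except ValueError as err:
    # Axis 0 has size 3 here, so NumPy refuses to squeeze it.
    error_message = str(err)

result = unbatch(batched_action)  # back to shape (3,), matching the action spec

print(error_message)
print(result.shape)
```

If that assumption holds, passing an action of shape (1, 3), for example tf.constant([[3, 3, 3]], dtype=tf.int32), to tf_env.step would be the thing to try.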