
I'm training an A3C agent using RLlib. My observations are 2D (time steps, features), so my first layer should be an LSTM, followed by another LSTM and then a fully connected layer before the final layer. I'm using the following config for that:

config["num_gpus"] = 1
config["model"] = {
    "use_lstm": True,
    "lstm_use_prev_action": True,
    "lstm_use_prev_reward": True,
    "lstm_cell_size": 64,
    "fcnet_hiddens": [],
}

I believe this is not producing my desired model architecture, but unfortunately I could not inspect it with model.summary(), since I'm getting the following:

AttributeError: 'ComplexInputNetwork_as_LSTMWrapper' object has no attribute 'summary'

Anyhow, I moved ahead and trained the agent above until it reached a reasonable reward, then saved the checkpoint, but I'm not able to use the trained model for prediction in production. This may also be related to the use of the LSTMWrapper. Any help is appreciated since I'm stuck.

MY CODE:

import numpy as np
from ray.rllib.policy.policy import Policy

my_restored_policy = Policy.from_checkpoint("checkpoint/checkpoint_100000/policies/default_policy")

env = MyEnv()
_ = env.reset()

obs = np.array(env.updateState())

action = my_restored_policy.compute_single_action(obs)

ERROR:

tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: You must feed a value for placeholder tensor 'var_scope_1/state_in_0' with dtype float and shape [?,64]
[[node var_scope_1/state_in_0 (defined at usr/local/lib/python3.6/dist-packages/ray/rllib/utils/tf_utils.py:217) ]]
[[var_scope_1/model_2/lstm/ExpandDims/_211]]
(1) Invalid argument: You must feed a value for placeholder tensor 'var_scope_1/state_in_0' with dtype float and shape [?,64]
[[node var_scope_1/state_in_0 (defined at usr/local/lib/python3.6/dist-packages/ray/rllib/utils/tf_utils.py:217) ]]
0 successful operations.
0 derived errors ignored.
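
The placeholder named in the error, state_in_0, has shape [?,64], which matches lstm_cell_size, so it is presumably the LSTM's recurrent state input, which the call above never supplies. A minimal sketch of what feeding it might look like, assuming the restored policy exposes get_initial_state() and that compute_single_action() accepts state, prev_action and prev_reward and returns an (action, state_out, info) tuple. The helper name step_recurrent_policy is hypothetical, and the first-step values for prev_action and prev_reward are left to the caller because they depend on the action space:

```python
def step_recurrent_policy(policy, obs, state=None, prev_action=None, prev_reward=None):
    """Run one inference step on a recurrent policy and return the new LSTM state.

    `policy` is assumed to behave like an RLlib Policy restored from a checkpoint.
    Pass state=None on the first step of an episode to start from the initial state.
    """
    if state is None:
        # Assumed to return the initial recurrent state (e.g. zero-filled h and c).
        state = policy.get_initial_state()

    # With lstm_use_prev_action / lstm_use_prev_reward enabled, the previous
    # action and reward are model inputs too, so they are forwarded here.
    action, state_out, _info = policy.compute_single_action(
        obs,
        state=state,
        prev_action=prev_action,
        prev_reward=prev_reward,
    )
    return action, state_out
```

In a prediction loop the returned state_out would be passed back in as state on the next call, together with the action just taken and the reward just received, so the LSTM keeps its memory across time steps within an episode.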

I tried saving the model so I could load it directly in Keras, but my_restored_policy.export_model does not raise any error and does not generate any file either.

ray
Felipe