
I am working with the new version of keras-rl2, trying to train my DQN agent, and I am having trouble with the fit function - https://github.com/tensorneko/keras-rl2/blob/master/rl/core.py . That link is the source for class Agent (line 147 --> env.step()). The env.step() call is returning more than 4 values, and I am not sure why. I have had trouble getting both new and old versions of gym to work with keras-rl. Has anyone resolved this issue? If so, please let me know which gym version you used to train the DQN agent, or how to handle the return value inside the fit function. You can refer to the full code in this other question: AttributeError: 'tuple' object has no attribute '__array_interface__'

pip show gym
Name: gym
Version: 0.26.2
Summary: Gym: A universal API for reinforcement learning environments
Home-page: https://www.gymlibrary.dev/
Author: Gym Community
Author-email: jkterry@umd.edu
License: MIT
Location: /home/harsh/.local/lib/python3.10/site-packages
Requires: cloudpickle, gym-notices, numpy
Required-by: 
Note: you may need to restart the kernel to use updated packages.
!git clone https://github.com/wau/keras-rl2.git
%cd /home/'user_name'/keras-rl2
env = gym.make("Breakout-v4")
nb_actions = env.action_space.n
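To illustrate the problem, calling the environment directly under gym 0.26 shows the extra return values (a minimal sketch, assuming the Atari extras for Breakout are installed):

import gym

env = gym.make("Breakout-v4")
obs, info = env.reset()  # gym>=0.26: reset returns (observation, info)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
done = terminated or truncated  # the old single "done" flag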

pip show keras

Name: keras
Version: 2.12.0
Summary: Deep learning for humans.
Home-page: https://keras.io/
Author: Keras team
Author-email: keras-users@googlegroups.com
License: Apache 2.0
Location: /home/harsh/.local/lib/python3.10/site-packages
Requires: 
Required-by: keras-rl, tensorflow
Note: you may need to restart the kernel to use updated packages.

from rl.agents.dqn import DQNAgent
from rl.policy import LinearAnnealedPolicy, EpsGreedyQPolicy
from tensorflow.keras.optimizers import Adam

# Load the weights
model.load_weights("weights/dqn_BreakoutDeterministic-v4_weights_900000.h5f")

# Update the policy to start with a smaller epsilon
policy = LinearAnnealedPolicy(EpsGreedyQPolicy(), attr='eps', value_max=0.3, value_min=.1, value_test=.05,
                              nb_steps=100000)


# Initialize the DQNAgent with the new model and updated policy and compile it
dqn = DQNAgent(model=model, nb_actions=nb_actions, policy=policy, memory=memory,
               processor=processor, nb_steps_warmup=50000, gamma=.99, target_model_update=10000)
dqn.compile(Adam(learning_rate=.00025), metrics=['mae'])

# And train the model
dqn.fit(env, nb_steps=500000, callbacks=[checkpoint_callback], log_interval=10000, visualize=False)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[25], line 15
     12 dqn.compile(Adam(learning_rate=.00025), metrics=['mae'])
     14 # And train the model
---> 15 dqn.fit(env, nb_steps=500000, callbacks=[checkpoint_callback], log_interval=10000, visualize=False)

File ~/.local/lib/python3.10/site-packages/rl/core.py:177, in Agent.fit(self, env, nb_steps, action_repetition, callbacks, verbose, visualize, nb_max_start_steps, start_step_policy, log_interval, nb_max_episode_steps)
    175 for _ in range(action_repetition):
    176     callbacks.on_action_begin(action)
--> 177     observation, r, done, info = env.step(action)
    178     observation = deepcopy(observation)
    179     if self.processor is not None:

ValueError: too many values to unpack (expected 4)
Progman

1 Answer


env.step now returns 5 values instead of 4. These are, in order, observation, reward, terminated, truncated, and info, so the unpacking should be

observation, reward, terminated, truncated, info = env.step(action)

While terminated and truncated both end the episode, they have different meanings. terminated is a boolean that is True when the agent reaches a terminal state of the environment. truncated is a boolean that is True when the episode is cut off by some additional constraint, such as a time limit or the agent walking outside the bounds of the environment.
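One way to keep using keras-rl2's fit unchanged is to wrap the environment so that step returns the old 4-tuple again. A minimal sketch, assuming gym 0.26 (the wrapper name Gym26Compat is mine, not part of either library):

import gym

class Gym26Compat(gym.Wrapper):
    # Collapse gym>=0.26's 5-tuple step and (obs, info) reset into the old API
    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        return obs  # old API: reset returned only the observation

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return obs, reward, terminated or truncated, info  # merge into one "done" flag

env = Gym26Compat(gym.make("Breakout-v4"))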

Alternatively, I advise you to use Stable Baselines 3 at a version <2.0.0 for a DQN agent. Stable Baselines 3 works great with Gym, and its agents can be extended and adapted to your liking.
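If you go that route, installation and a minimal training run would look roughly like this (an untested sketch; the hyperparameters are illustrative only, and you may need a gym version matching the SB3 release notes):

pip install "stable-baselines3[extra]<2.0.0"

from stable_baselines3 import DQN

# CnnPolicy handles the image observations from Atari
model = DQN("CnnPolicy", "BreakoutNoFrameskip-v4", buffer_size=100_000, verbose=1)
model.learn(total_timesteps=500_000)
model.save("dqn_breakout")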

Lexpj
  • Hey, could you let me know how to install Stable Baselines version <2.0.0 for a DQN agent? This would mean installing the older version of keras-rl, I believe. – Harshith RM May 15 '23 at 19:45
  • I am not sure if I should downgrade keras or keras-rl. Also I am unable to find the documentation for the older versions of keras and keras-rl. – Harshith RM May 15 '23 at 19:55
  • The reason I suggested version <2.0.0 is because 2.0.0 was just released and uses the new Gymnasium, which could lead to some problems. Here is the ReadTheDocs of DQN of Stable Baselines 3: https://stable-baselines3.readthedocs.io/en/master/modules/dqn.html Let me know if this suffices – Lexpj May 15 '23 at 20:04