I am exploring RLlib, training the Atari 'Breakout' environment. The difficulty I am facing is getting the trained agent to actually play the game.

The first problem is that Agent.compute_single_action(obs) doesn't automatically preprocess the raw (1, 210, 160, 3) Atari observation.

Because of that, I tried using RLlib's built-in preprocessor. It converts the input from (1, 210, 160, 3) to (1, 84, 84, 3), but the PPO network expects input of shape (?, 84, 84, 4).

How do I change the frame stack from 3 to 4 in RLlib, so that the preprocessed observation has shape (84, 84, 4)?
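From digging through the RLlib docs, my working assumption (not verified) is that the 4 is not a colour channel at all but a stack of the last 4 grayscale 84x84 frames, coming from RLlib's default "deepmind" preprocessing. If that is right, the same pipeline should be reproducible on an evaluation env with RLlib's own Atari wrappers, something like:

import gym
import numpy as np
from ray.rllib.env.wrappers.atari_wrappers import wrap_deepmind

# wrap_deepmind applies the Deepmind-style pipeline (84x84 grayscale, 4-frame stack)
env = wrap_deepmind(gym.make('Breakout-v4', render_mode='rgb_array'))
observation = env.reset()

print(env.observation_space.shape)    # (84, 84, 4)
print(np.asarray(observation).shape)  # (84, 84, 4)

# with the wrapped env, the observation should already match the policy input
action = Agent.compute_single_action(observation)

With the env wrapped like this, the observations returned by reset() and step() would already have the shape the trained policy expects, so no manual preprocessing would be needed. Is this the intended way to do it?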

import gym

env = gym.make('Breakout-v4', render_mode='rgb_array')
observation = env.reset()

for i in range(200):
    show_render_4(env)     # display helper defined elsewhere in my notebook
    video.capture_frame()  # video recorder defined elsewhere in my notebook
    # fails here: observation is the raw Atari frame, not the (84, 84, 4) input the model expects
    action = Agent.compute_single_action(observation)
    observation, reward, done, info = env.step(action)
    if done:
        break
env.close()

(calling compute_single_action directly on the raw observation)
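For context, Agent here is a PPOTrainer restored from a training checkpoint; a minimal sketch of that setup (the config is abbreviated and the checkpoint path is a placeholder):

from ray.rllib.agents.ppo import PPOTrainer

# restore the trained PPO agent (sketch; the real config and checkpoint path differ)
Agent = PPOTrainer(env="Breakout-v4")
Agent.restore("path/to/ppo_breakout_checkpoint")  # placeholder path

# the restored policy reports the observation shape it was trained with,
# which is where the (?, 84, 84, 4) expectation comes from
print(Agent.get_policy().observation_space)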

import gym
from ray.rllib.models.preprocessors import get_preprocessor

env = gym.make('Breakout-v4', render_mode='rgb_array')
observation = env.reset()

# build the preprocessor once from the env's observation space
prep = get_preprocessor(env.observation_space)(env.observation_space)
obs = prep.transform(observation)  # (210, 160, 3) -> (84, 84, 3), but the policy wants (84, 84, 4)

for i in range(200):
    action = Agent.compute_single_action(obs)
    observation, reward, done, info = env.step(action)
    obs = prep.transform(observation)
    if done:
        break

env.close()

(with RLlib's built-in preprocessor)
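A workaround I am considering is building the (84, 84, 4) input by hand: grayscale each frame, resize it to 84x84, and keep a rolling stack of the last 4 frames. Sketch below (uses OpenCV; I am not sure it matches RLlib's internal preprocessing pixel-for-pixel):

from collections import deque

import cv2
import gym
import numpy as np

def to_84x84_gray(frame):
    # convert one raw (210, 160, 3) RGB frame to an 84x84 grayscale frame
    gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
    return cv2.resize(gray, (84, 84), interpolation=cv2.INTER_AREA)

env = gym.make('Breakout-v4', render_mode='rgb_array')
observation = env.reset()

# start with the first frame repeated 4 times
frames = deque([to_84x84_gray(observation)] * 4, maxlen=4)

for i in range(200):
    obs = np.stack(frames, axis=-1)  # shape (84, 84, 4)
    action = Agent.compute_single_action(obs)
    observation, reward, done, info = env.step(action)
    frames.append(to_84x84_gray(observation))
    if done:
        break

env.close()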

Ray version: 2.0.1. Algorithm used: PPOTrainer.
