I am exploring RLlib by training an agent on Atari Breakout. The difficulty I face is getting the trained agent to play the game.
The first problem is that Agent.compute_single_action(obs) doesn't automatically preprocess the raw (1, 210, 160, 3) Atari observation.
Because of that, I tried RLlib's built-in preprocessor. It converts the input from (1, 210, 160, 3) to (1, 84, 84, 3), but the PPO network expects input of shape (?, 84, 84, 4).
How do I change the framestack from 3 to 4 in RLlib?
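To make the shape mismatch concrete, here is a minimal NumPy-only sketch of what a 4-frame stack looks like. It assumes the trailing 4 in (?, 84, 84, 4) means four consecutive grayscale frames, while the trailing 3 in the raw frame is the RGB channel axis. The names to_gray_84 and FrameStacker are hypothetical helpers written for illustration, not RLlib APIs, and the resize is a crude nearest-neighbour stand-in. RLlib ships its own Atari wrappers in ray.rllib.env.wrappers.atari_wrappers (for example wrap_deepmind), which, as far as I know, apply a similar grayscale, resize and stack pipeline during training; the sketch below does not call them.

```python
from collections import deque

import numpy as np


def to_gray_84(frame: np.ndarray) -> np.ndarray:
    """Convert one (210, 160, 3) RGB frame to an (84, 84) grayscale frame.

    Nearest-neighbour sampling keeps this dependency-free; it is an
    assumption for illustration, not the exact resize RLlib uses.
    """
    gray = frame.astype(np.float32).mean(axis=2)  # (210, 160)
    rows = np.linspace(0, gray.shape[0] - 1, 84).astype(int)
    cols = np.linspace(0, gray.shape[1] - 1, 84).astype(int)
    return gray[np.ix_(rows, cols)].astype(np.uint8)  # (84, 84)


class FrameStacker:
    """Hypothetical helper: keeps the last k frames, stacked on the last axis."""

    def __init__(self, k: int = 4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, frame: np.ndarray) -> np.ndarray:
        # At episode start, fill the buffer with k copies of the first frame.
        small = to_gray_84(frame)
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(small)
        return self.observation()

    def step(self, frame: np.ndarray) -> np.ndarray:
        # Append the newest frame; the deque drops the oldest automatically.
        self.frames.append(to_gray_84(frame))
        return self.observation()

    def observation(self) -> np.ndarray:
        return np.stack(list(self.frames), axis=-1)  # (84, 84, k)


# Stand-in for a raw Atari frame (random pixels, not a real game screen).
raw = np.random.randint(0, 256, size=(210, 160, 3), dtype=np.uint8)
stacker = FrameStacker(k=4)
obs = stacker.reset(raw)
next_obs = stacker.step(raw)
print(obs.shape, next_obs.shape)
```

In a rollout loop, reset() would be called on the first observation of an episode and step() on each later one, so the policy always receives an (84, 84, 4) array.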
import gym

# show_render_4 and video are my own helpers (not shown here).
env = gym.make('Breakout-v4', render_mode='rgb_array')
observation = env.reset()
for i in range(200):
    show_render_4(env)
    video.capture_frame()
    # The raw (210, 160, 3) frame goes to the policy without preprocessing.
    action = Agent.compute_single_action(observation)
    observation, reward, done, info = env.step(action)
    if done:
        break
env.close()
import gym
from ray.rllib.models.preprocessors import get_preprocessor

env = gym.make('Breakout-v4', render_mode='rgb_array')
observation = env.reset()
# Build the preprocessor once and reuse it for every frame.
prep = get_preprocessor(env.observation_space)(env.observation_space)
obs = prep.transform(observation)
for i in range(200):
    action = Agent.compute_action(obs)
    observation, reward, done, info = env.step(action)
    obs = prep.transform(observation)
    if done:
        break
env.close()
The snippet above uses RLlib's built-in preprocessor.
Ray version: 2.0.1. Algorithm used: PPOTrainer.