I'm trying to train CartPole-v0 using Q-learning. When I try to update the replay buffer with a new experience, I get the following error:
ValueError: Cannot feed value of shape (128,) for Tensor 'Placeholder_1:0', which has shape '(?, 2)'
The related code snippet is:
def update_replay_buffer(replay_buffer, state, action, reward, next_state, done, action_dim):
    # append to buffer
    experience = (state, action, reward, next_state, done)
    replay_buffer.append(experience)
    # Ensure replay_buffer doesn't grow larger than REPLAY_SIZE
    if len(replay_buffer) > REPLAY_SIZE:
        replay_buffer.pop(0)
    return None
The placeholder being fed is:
action_in = tf.placeholder("float", [None, action_dim])
Can someone clarify how action_dim should be used to resolve this error?
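For context, the shapes in the error suggest that a batch of 128 actions is being fed as plain integer indices, giving shape (128,), while the placeholder expects one row of length action_dim per sample, giving shape (128, 2). One possible use of action_dim, sketched below, is to one-hot encode each action before it is stored. This assumes the actions are integer indices in the range 0 to action_dim - 1; the `one_hot` helper and the small `REPLAY_SIZE` value are hypothetical and only for illustration.

```python
REPLAY_SIZE = 3  # illustrative value only; the real constant is defined elsewhere


def one_hot(action, action_dim):
    """Hypothetical helper: list of length action_dim with 1.0 at index `action`."""
    encoded = [0.0] * action_dim
    encoded[action] = 1.0
    return encoded


def update_replay_buffer(replay_buffer, state, action, reward, next_state, done, action_dim):
    # Store the action as a one-hot vector so a sampled batch has shape (batch, action_dim)
    experience = (state, one_hot(action, action_dim), reward, next_state, done)
    replay_buffer.append(experience)
    # Ensure replay_buffer doesn't grow larger than REPLAY_SIZE
    if len(replay_buffer) > REPLAY_SIZE:
        replay_buffer.pop(0)


# Usage: fill the buffer with dummy transitions, then collect the action batch
buffer = []
for step in range(5):
    update_replay_buffer(buffer, [0.0] * 4, step % 2, 1.0, [0.0] * 4, False, action_dim=2)

# Each entry is a list of length action_dim, so the batch is (len(buffer), 2)
action_batch = [experience[1] for experience in buffer]
```

With actions stored this way, a batch sampled from the buffer already has one row per sample and action_dim columns, which matches the `[None, action_dim]` placeholder. If the actions are already one-hot at this point, the mismatch would instead be in how the batch is assembled before it is fed.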