
I'm trying to adapt the Q-learning example from https://github.com/lanking520/RL-FlappyBird to play a different game, Pathery.

When calculating Q, I get a shape-mismatch error.

(QAgent.java L95)

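  // forward pass: per-action Q-values for the pre- and post-action states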
  NDList QReward = trainer.forward(preInput);
  NDList targetQReward = trainer.forward(postInput);

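  // mask the Q-values with the one-hot actions, then sum over the action axis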
  NDList Q = new NDList(QReward.singletonOrThrow()
      .mul(actionInput.singletonOrThrow())
      .sum(new int[]{1}));

It fails specifically at

.mul(actionInput.singletonOrThrow())

MXNetError: Check failed: l == 1 || r == 1: operands could not be broadcast together with shapes [32,2] [32,153]

The original game had an action space of size 2, whereas mine has size 153. Judging by the error, QReward still comes back with shape [32, 2] while my one-hot actionInput has shape [32, 153], so the element-wise multiply no longer broadcasts.

How can I calculate Q here with an action space larger than 2?
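For reference, here is a standalone sketch of the mask-and-sum step the way I understand it, with the shapes I expect. The class name, the random stand-in for the network output, and the hard-coded action index are placeholders I made up for the demo:

  import ai.djl.ndarray.NDArray;
  import ai.djl.ndarray.NDManager;
  import ai.djl.ndarray.index.NDIndex;
  import ai.djl.ndarray.types.Shape;

  public class QMaskSketch {
    public static void main(String[] args) {
      try (NDManager manager = NDManager.newBaseManager()) {
        int batchSize = 32;
        int actionSpace = 153; // 2 in the original FlappyBird example

        // stand-in for the network output: one Q-value per action
        NDArray qReward = manager.randomUniform(0f, 1f, new Shape(batchSize, actionSpace));

        // one-hot mask of the actions taken; action 0 everywhere, just for the demo
        NDArray actionMask = manager.zeros(new Shape(batchSize, actionSpace));
        actionMask.set(new NDIndex(":, 0"), 1f);

        // mask, then sum over the action axis -> shape (32)
        NDArray q = qReward.mul(actionMask).sum(new int[]{1});
        System.out.println(q.getShape());
      }
    }
  }

This sketch runs, which makes me suspect the real problem is the network output width (2 vs. 153) rather than the masking itself, but I don't see where that is supposed to be changed in this setup.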
