
I've been trying to set up a simple reinforcement learning example using tfjs. However, when trying to train the model I am running into the following error:

Uncaught (in promise) Error: Error when checking target: expected dense_Dense5 to have shape [,1], but got array with shape [3,4]

I built the model up as following:

const NUM_OUTPUTS = 4;

const model = tf.sequential();

//First hidden Layer, which also defines the input shape of the model
model.add(
  tf.layers.dense({
    units: LAYER_1_UNITS,
    batchInputShape: [null, NUM_INPUTS],
    activation: "relu",
  })
);

// Second hidden Layer
model.add(tf.layers.dense({ units: LAYER_2_UNITS, activation: "relu" }));

// Third hidden Layer
model.add(tf.layers.dense({ units: LAYER_3_UNITS, activation: "relu" }));

// Fourth hidden Layer
model.add(tf.layers.dense({ units: LAYER_4_UNITS, activation: "relu" }));

// Defining the output Layer of the model
model.add(tf.layers.dense({ units: NUM_OUTPUTS, activation: "relu" }));

model.compile({
  optimizer: tf.train.adam(),
  loss: "sparseCategoricalCrossentropy",
  metrics: "accuracy",
});

The training is done by a function that calculates the Q-values for some examples:

batch.forEach((sample) => {
  const { state, nextState, action, reward } = sample;
  // We let the model predict the rewards of the current state.
  const current_Q: tf.Tensor = <tf.Tensor>model.predict(state);

  // We also let the model predict the rewards for the next state,
  // if there was a next state in the game.
  let future_reward = tf.zeros([NUM_ACTIONS]);
  if (nextState) {
    future_reward = <tf.Tensor>model.predict(nextState);
  }

  let totalValue =
    reward + discountFactor * future_reward.max().dataSync()[0];
  current_Q.bufferSync().set(totalValue, 0, action);

  // We can now push the state to the input collector
  x = x.concat(Array.from(state.dataSync()));
  // For the labels/outputs, we push the updated Q values
  y = y.concat(Array.from(current_Q.dataSync()));
});
await model.fit(
  tf.tensor2d(x, [batch.length, NUM_INPUTS]),
  tf.tensor2d(y, [batch.length, NUM_OUTPUTS]),
  {
    batchSize: batch.length,
    epochs: 3,
  }
);

This appeared to be the right way to provide the examples to the fit function, since logging the model shows the correct shape for the last dense layer:

Log of the shape of dense_Dense5

However, it results in the error shown above: instead of accepting the expected shape [3,4], it checks for the shape [,1]. I really don't understand where this shape is suddenly coming from and would much appreciate some help with this!

For a better overview, you can simply view/check out the whole project from its GitHub repo:

GitHub Repo

The TensorFlow code in question is in the AI folder.

EDIT:

Providing a summary of the model plus some info on the shape of the tensor I'm providing for y in model.fit(x, y):

model summary

labels

  • Could you change `batchInputShape: [null, NUM_INPUTS],` to `batchInputShape: [NUM_INPUTS]`? You do not need to add in `null`. – yudhiesh Apr 28 '21 at 06:45
  • @yudhiesh changing batchInputShape to `[NUM_INPUTS]` isn't possible, as batchInputShape expects at least 2 dimensions. I did, however, try to use `inputShape: [NUM_INPUTS]` instead, for which you can define a single dimension and have the batch size automated. This results in the same error though, with the Dense5 layer somehow being assumed to have a shape `[,1]` instead of the shape `[batchLength, NUM_OUTPUTS]` that it should have – Jezfromthe6 Apr 28 '21 at 10:08
  • Ok for `batchInputShape` it should be `[batch_size, input_shape]`. Also is this a classification or regression task? The final layer has a relu activation with 4 outputs. – yudhiesh Apr 28 '21 at 13:14
  • @yudhiesh ohh good point! As it is supposed to predict the Q-values for the actions, it is a regression task, so I'm not gonna get far with ReLU there, I suppose. That probably explains the expected output shape of [,1]. Using ReLU as output probably expects a single label with a class? I should probably just use no activation on the last layer then, so it just computes values with no activation function applied to them? – Jezfromthe6 Apr 28 '21 at 16:00
  • Yup for Q-values removing the activation function on the last layer will do it. – yudhiesh Apr 28 '21 at 16:07
  • @yudhiesh Great to know! Sadly that doesn't fix the issue. I have a feeling that either the way I am providing the batch of y-values for the output units is wrong, or I am somehow missing something in the model setup. Do I need to provide some option to indicate that the model has multiple output units and that I'm providing y-values for each? – Jezfromthe6 Apr 28 '21 at 17:44

1 Answer


Solved: the issue occurred due to using the wrong loss function. sparseCategoricalCrossentropy expects integer class labels of shape [batchSize, 1], whereas the targets here are full Q-value vectors of shape [batchSize, NUM_OUTPUTS]. Moving to meanSquaredError fixed the shape mismatch on the output layer.