
I've been trying to set up a simple reinforcement learning example using tfjs. However, when trying to train the model I am running into the following error:

Uncaught (in promise) Error: Error when checking target: expected dense_Dense5 to have shape [,1], but got array with shape [3,4]

I built the model up as following:

const NUM_OUTPUTS = 4;

const model = tf.sequential();

//First hidden Layer, which also defines the input shape of the model
model.add(
  tf.layers.dense({
    units: LAYER_1_UNITS,
    batchInputShape: [null, NUM_INPUTS],
    activation: "relu",
  })
);

// Second hidden Layer
model.add(tf.layers.dense({ units: LAYER_2_UNITS, activation: "relu" }));

// Third hidden Layer
model.add(tf.layers.dense({ units: LAYER_3_UNITS, activation: "relu" }));

// Fourth hidden Layer
model.add(tf.layers.dense({ units: LAYER_4_UNITS, activation: "relu" }));

// Defining the output Layer of the model
model.add(tf.layers.dense({ units: NUM_OUTPUTS, activation: "relu" }));

model.compile({
  optimizer: tf.train.adam(),
  loss: "sparseCategoricalCrossentropy",
  metrics: "accuracy",
});

The training is done by a function that calculates the Q-values for some examples:

batch.forEach((sample) => {
  const { state, nextState, action, reward } = sample;
  // We let the model predict the rewards of the current state.
  const current_Q: tf.Tensor = <tf.Tensor>model.predict(state);

  // We also let the model predict the rewards for the next state,
  // if there was a next state in the game.
  let future_reward = tf.zeros([NUM_ACTIONS]);
  if (nextState) {
    future_reward = <tf.Tensor>model.predict(nextState);
  }

  let totalValue =
    reward + discountFactor * future_reward.max().dataSync()[0];
  current_Q.bufferSync().set(totalValue, 0, action);

  // We can now push the state to the input collector
  x = x.concat(Array.from(state.dataSync()));
  // For the labels/outputs, we push the updated Q values
  y = y.concat(Array.from(current_Q.dataSync()));
});
await model.fit(
  tf.tensor2d(x, [batch.length, NUM_INPUTS]),
  tf.tensor2d(y, [batch.length, NUM_OUTPUTS]),
  {
    batchSize: batch.length,
    epochs: 3,
  }
);

This appeared to be the right way to provide the examples to the fit function, since logging the model shows the correct shape for the last dense layer:

Log of the shape of dense_Dense5

However, it results in the error shown above: instead of accepting the expected shape [3,4], it checks for the shape [,1]. I really don't understand where this shape is suddenly coming from and would much appreciate some help with this!

For a better overview, you can simply view/check out the whole project from its GitHub repo:

GitHub Repo

The TensorFlow code in question is in the AI folder.

EDIT:

Providing a summary of the model plus some info on the shape of the tensor I'm providing for y in model.fit(x, y):

model summary

labels

  • Could you change `batchInputShape: [null, NUM_INPUTS],` to `batchInputShape: [NUM_INPUTS]`? You do not need to add in `null`. – yudhiesh Apr 28 '21 at 06:45
  • @yudhiesh changing batchInputShape to `[NUM_INPUTS]` isn't possible, as batchInputShape expects at least 2 dimensions. I did, however, try to use `inputShape: [NUM_INPUTS]` instead, for which you can define a single dimension and have the batch size automated. This results in the same error though, with the Dense5 layer somehow being assumed to have a shape `[,1]` instead of the shape `[batchLength, NUM_OUTPUTS]` that it should have – Jezfromthe6 Apr 28 '21 at 10:08
  • Ok for `batchInputShape` it should be `[batch_size, input_shape]`. Also is this a classification or regression task? The final layer has a relu activation with 4 outputs. – yudhiesh Apr 28 '21 at 13:14
  • @yudhiesh ohh good point! As it is supposed to predict the Q-values for the actions, it is a regression task, so I'm not gonna get far with ReLU there, I suppose. That probably explains the expected output shape of [,1]. Using ReLU as output probably expects a single label with a class? I should probably just use no activation on the last layer then, so it just computes values with no activation function applied to them? – Jezfromthe6 Apr 28 '21 at 16:00
  • Yup for Q-values removing the activation function on the last layer will do it. – yudhiesh Apr 28 '21 at 16:07
  • @yudhiesh Great to know! Sadly that doesn't fix the issue. I have a feeling that either the way I am providing the batch of y-values for the output units is wrong, or I am somehow missing something in the model setup. Do I need to provide some option to indicate that the model has multiple output units and that I'm providing y-values for each? – Jezfromthe6 Apr 28 '21 at 17:44

1 Answer


Solved: the issue occurred due to using the wrong loss function. sparseCategoricalCrossentropy expects integer class labels of shape [batchSize, 1], whereas the targets here are full Q-value vectors of shape [batchSize, NUM_OUTPUTS]. Moving to meanSquaredError fixed the shape mismatch on the output layer.