
I followed this tutorial to implement a DQN for the taxi domain.

However, when I predict values for a batch, every input gets the same output.

The input to the embedding layer is a single float in the interval [0, 1] that can take 500 possible values. The batch size is 16, so a batch might look like this:

 [[0.206]
 [0.816]
 [0.768]
 [0.046]
 [0.902]
 [0.384]
 [0.302]
 [0.984]
 [0.588]
 [0.524]
 [0.164]
 [0.102]
 [0.606]
 [0.224]
 [0.728]
 [0.566]]
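
One thing worth checking about this encoding: an embedding layer looks up rows by integer index, and as far as I know Keras's `Embedding` casts non-integer inputs to integers, which truncates toward zero. The sketch below is plain Python, not Keras. It assumes the floats were produced by dividing a state index in 0..499 by 500 (a guess, based on every value being a multiple of 0.002), and `to_state_index` is a hypothetical helper, not part of any library.

```python
# Values copied from the example batch above.
batch = [0.206, 0.816, 0.768, 0.046, 0.902, 0.384, 0.302, 0.984,
         0.588, 0.524, 0.164, 0.102, 0.606, 0.224, 0.728, 0.566]

NUM_STATES = 500  # matches the first argument of Embedding(500, 10)

# What an integer cast does to floats in [0, 1): everything becomes index 0.
truncated = [int(x) for x in batch]


def to_state_index(x, num_states=NUM_STATES):
    """Hypothetical helper: undo an assumed `index / num_states` scaling."""
    return round(x * num_states)


# Recover one integer index per input under that assumption.
indices = [to_state_index(x) for x in batch]

print(truncated)  # every entry is 0
print(indices)    # 16 distinct integers in the range 0..499
```

If the assumption about the scaling holds, feeding `indices` (as integers) rather than the scaled floats would give each state its own embedding row.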

However, model.predict_on_batch(batch) returns the same prediction for every row:

[[-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]
 [-0.01847944 -0.04587542 -0.01173695 -0.04059657  0.00310457  0.01856036]]

And this is the network architecture:

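# Imports assume tf.keras; adjust if standalone Keras is used.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Reshape, Dense
from tensorflow.keras.optimizers import Adam

model = Sequential()
model.add(Embedding(500, 10, input_length=1))  # 500 states, 10-dim embedding
model.add(Reshape((10,)))
model.add(Dense(32, input_shape=(1,), activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(6, activation='linear'))  # one output per action

# LEARNING_RATE is defined elsewhere in the script.
model.compile(loss='mse', optimizer=Adam(learning_rate=LEARNING_RATE))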
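
Because every layer after the embedding is deterministic, inputs that land on the same embedding row necessarily produce the same output. The toy sketch below, in plain Python rather than Keras, illustrates this with a random stand-in for the `Embedding(500, 10)` weight matrix. The `table` and `embed` names are hypothetical, and the integer cast inside `embed` is an assumption about how the real layer treats float inputs.

```python
import random

random.seed(0)
NUM_STATES, EMBED_DIM = 500, 10

# Toy stand-in for the Embedding(500, 10) weights: one row per state.
table = [[random.uniform(-0.05, 0.05) for _ in range(EMBED_DIM)]
         for _ in range(NUM_STATES)]


def embed(batch):
    """Look up one row per input, casting each input to int first."""
    return [table[int(x)] for x in batch]


float_inputs = [0.206, 0.816, 0.768]  # all truncate to index 0
int_inputs = [103, 408, 384]          # distinct integer state indices

same_rows = embed(float_inputs)
distinct_rows = embed(int_inputs)

# Floats in [0, 1) all select row 0, so the rows are identical.
print(all(row == same_rows[0] for row in same_rows))
# Distinct integer indices select different rows.
print(all(row == distinct_rows[0] for row in distinct_rows))
```

In this toy setup the float inputs all select row 0 of the table, while the integer inputs select three different rows.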

Why don't different inputs produce different predicted values?

HenDoNR