I have a neural network in Keras that uses LSTM layers to train a chatbot.

from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import LSTM
from keras.callbacks import ModelCheckpoint

contextTrain, contextTest, utteranceTrain, utteranceTest = train_test_split(context, utterance, test_size=0.1, random_state=1)
model = Sequential()
model.add(LSTM(input_shape=contextTrain.shape[1:], return_sequences=True, units=300, activation="sigmoid", kernel_initializer="glorot_normal", recurrent_initializer="glorot_normal"))
model.add(LSTM(return_sequences=True, units=300, activation="sigmoid", kernel_initializer="glorot_normal", recurrent_initializer="glorot_normal"))
model.compile(loss="cosine_proximity", optimizer="adam", metrics=["accuracy"])
model.fit(contextTrain, utteranceTrain, epochs=5000, validation_data=(contextTest, utteranceTest), callbacks=[ModelCheckpoint("model{epoch:02d}.h5", monitor='val_acc', save_best_only=True, mode='max')])

context and utterance are numpy arrays with shapes such as (1000, 10, 300) or (10000, 10, 300). The input_shape of the first LSTM should therefore be (10, 300). Each vector of size 300 is a word represented by an embedding created by a Word2vec model, so in this example each input sample consists of 10 such vectors.
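For clarity, here is a minimal sketch of the data layout I described (the arrays here are random placeholders, not the actual Word2vec data):

import numpy as np

# Placeholder data with the shapes described above; the real arrays come from Word2vec.
context = np.random.rand(1000, 10, 300).astype("float32")    # 1000 samples, 10 words, 300-dim embeddings
utterance = np.random.rand(1000, 10, 300).astype("float32")

print(context.shape[1:])  # (10, 300) -> used as input_shape of the first LSTM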

The biggest problem is that loss and val_loss are both negative and almost steadily growing in magnitude during training.

Epoch 1/5000
900/900 [==============================] - 18s 20ms/step - loss: -0.5855 - acc: 0.0220 - val_loss: -0.6527 - val_acc: 0.0260
Epoch 2/5000
900/900 [==============================] - 13s 14ms/step - loss: -0.6299 - acc: 0.0239 - val_loss: -0.6673 - val_acc: 0.0240
Epoch 3/5000
900/900 [==============================] - 12s 14ms/step - loss: -0.6387 - acc: 0.0213 - val_loss: -0.6764 - val_acc: 0.0160
Epoch 4/5000
900/900 [==============================] - 12s 13ms/step - loss: -0.6457 - acc: 0.0229 - val_loss: -0.6821 - val_acc: 0.0240
Epoch 5/5000
900/900 [==============================] - 12s 14ms/step - loss: -0.6497 - acc: 0.0274 - val_loss: -0.6873 - val_acc: 0.0230
Epoch 6/5000
900/900 [==============================] - 14s 15ms/step - loss: -0.6507 - acc: 0.0276 - val_loss: -0.6874 - val_acc: 0.0240
Epoch 7/5000
900/900 [==============================] - 15s 16ms/step - loss: -0.6517 - acc: 0.0279 - val_loss: -0.6877 - val_acc: 0.0260
Epoch 8/5000
900/900 [==============================] - 14s 16ms/step - loss: -0.6526 - acc: 0.0272 - val_loss: -0.6875 - val_acc: 0.0230
Epoch 9/5000
900/900 [==============================] - 14s 16ms/step - loss: -0.6530 - acc: 0.0274 - val_loss: -0.6879 - val_acc: 0.0240
Epoch 10/5000
900/900 [==============================] - 14s 15ms/step - loss: -0.6530 - acc: 0.0278 - val_loss: -0.6871 - val_acc: 0.0230

What could be the reason for loss and val_loss behaving this way and not decreasing toward zero? Is there something wrong with the neural network, the training data, or something else?

If you need any further information, I will provide it.

Thank you for any reply.

Spook

1 Answer

You are using the cosine_proximity loss function of Keras. This loss is 1 if the output does not match the target at all, but it is -1 if the output matches the target perfectly (see this and this). Therefore, a value that is converging towards -1 is a good sign, as the actual difference between the target and the output is decreasing.
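For illustration, here is a small numpy sketch that approximates how Keras computes cosine_proximity (the negative mean cosine similarity between the L2-normalized target and prediction):

import numpy as np

def cosine_proximity(y_true, y_pred):
    # L2-normalize along the embedding axis, then take the negative
    # mean cosine similarity, as the Keras loss does.
    y_true = y_true / np.linalg.norm(y_true, axis=-1, keepdims=True)
    y_pred = y_pred / np.linalg.norm(y_pred, axis=-1, keepdims=True)
    return -np.mean(np.sum(y_true * y_pred, axis=-1))

target = np.random.rand(10, 300)
print(cosine_proximity(target, target))                    # -1.0: output matches the target perfectly
print(cosine_proximity(target, -target))                   # +1.0: output points in the opposite direction
print(cosine_proximity(target, np.random.rand(10, 300)))   # somewhere in between for unrelated vectors

So a loss of about -0.65 in your log simply means the average cosine similarity between the predicted and target embeddings is about 0.65, and it is still improving epoch by epoch.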

zimmerrol