
I am training an LSTM model on the SemEval 2017 Task 4A dataset. I observe that validation accuracy first increases along with training accuracy, but then suddenly drops by a significant amount. Training loss keeps decreasing, while validation loss increases by a significant amount.

[Image: a sample of the training data]

Here is the code for my model:

from keras.models import Sequential
from keras.layers import (Embedding, BatchNormalization, Activation,
                          Dropout, Bidirectional, LSTM, Dense)

model = Sequential()
# max_words = 2000 and max_len = 300, judging by the summary below
model.add(Embedding(max_words, 30, input_length=max_len))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.3))
model.add(Bidirectional(LSTM(32)))         # 32 units per direction -> 64 outputs
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))  # binary sentiment output
model.summary()

And here is the model summary:

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_2 (Embedding)      (None, 300, 30)           60000     
_________________________________________________________________
batch_normalization_3 (Batch (None, 300, 30)           120       
_________________________________________________________________
activation_3 (Activation)    (None, 300, 30)           0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 300, 30)           0         
_________________________________________________________________
bidirectional_2 (Bidirection (None, 64)                16128     
_________________________________________________________________
batch_normalization_4 (Batch (None, 64)                256       
_________________________________________________________________
activation_4 (Activation)    (None, 64)                0         
_________________________________________________________________
dropout_4 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 65        
=================================================================
Total params: 76,569
Trainable params: 76,381
Non-trainable params: 188

I am using GloVe word embeddings, the Adam optimizer, and the binary cross-entropy loss function.
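
The compile/fit step is not shown above; with the stated optimizer and loss it would look roughly like this (a sketch -- X_train, y_train, X_val, y_val and the batch size are placeholders, not actual code from the question):

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# 20 epochs, as mentioned in the comments below
history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=20, batch_size=64)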

PeakyBlinder

1 Answer


You have a few choices:

  1. Keep training and see what happens.
  2. If the val_loss keeps getting worse, you're overfitting -- check out how to deal with that: get more data, use a simpler network, or do whatever works in your particular case (one common fix is sketched after this list).
  3. If the val_loss gets better again, you're on the right path.
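
For example, early stopping is one common way to act on points 2 and 3 automatically (a minimal sketch -- the patience value and the data variables are placeholders, not from the question):

from keras.callbacks import EarlyStopping

# Stop once val_loss has not improved for `patience` epochs, and
# roll back to the best weights seen so far.
early_stop = EarlyStopping(monitor='val_loss', patience=3,
                           restore_best_weights=True)

model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          epochs=20, batch_size=64,
          callbacks=[early_stop])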

And, yeah, share the results with us -- what happens if you run training for a couple more epochs?
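
An easy way to share them is to plot the History object returned by model.fit (a sketch, assuming matplotlib is installed):

import matplotlib.pyplot as plt

# history is the object returned by model.fit
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.legend()
plt.show()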

lenik
  • I trained it for 20 epochs and observed that validation accuracy went up, then came down, and after 3-4 epochs rose again. The same with validation loss: it decreased and then increased, oscillating like a sine wave. – PeakyBlinder May 30 '20 at 17:24