I am using Keras to build a CNN-LSTM model for tweet classification. The model has two inputs, and the task is three-class classification. The code I use to build the model is given below:
from keras.layers import Input, Conv2D, Flatten, Reshape, LSTM, Dense, concatenate
from keras.models import Model
from keras.optimizers import Adam
from keras import regularizers

def conv2d_lstm_with_author():
    # Get the input information - author & tweet
    author_repre_input = Input(shape=(100,), name='author_input')
    tweet_input = Input(shape=(13, 100, 1), name='tweet_input')
    # Create the convolutional layer and lstm layer
    conv2d = Conv2D(filters=200, kernel_size=(2, 100), padding='same', activation='relu',
                    use_bias=True, name='conv_1')(tweet_input)
    flat = Flatten(name='flatten_1')(conv2d)
    reshape_flat = Reshape((260000, 1), name='reshape_1')(flat)
    lstm = LSTM(100, return_state=False, activation='tanh',
                recurrent_activation='hard_sigmoid', name='lstm_1')(reshape_flat)
    concatenate_layer = concatenate([lstm, author_repre_input], axis=1, name='concat_1')
    dense_1 = Dense(10, activation='relu', name='dense_1')(concatenate_layer)
    output = Dense(3, activation='softmax', kernel_regularizer=regularizers.l2(0.01),
                   name='output_dense')(dense_1)
    # Build the model
    model = Model(inputs=[author_repre_input, tweet_input], outputs=output)
    return model

model = conv2d_lstm_with_author()
model.summary()

optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
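For reference, here is how I understand the tensor shapes flowing through the model (my own shape trace; the loop just prints what Keras reports for each layer):

# Print the output shape of every layer (batch dimension shows up as None).
for layer in model.layers:
    print(layer.name, layer.output_shape)

# By my arithmetic, Conv2D with 200 filters and padding='same' keeps the
# (13, 100) spatial size, so Flatten yields 13 * 100 * 200 = 260000 values,
# which the Reshape then hands to the LSTM as 260,000 timesteps of size 1.
print(13 * 100 * 200)  # 260000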
The shapes of my two inputs and labels are:
author_repre_input: (40942, 100)
tweet_input: (40942, 13, 100, 1)
my label Train_Y: (40942, 3)
A snapshot of the model summary: [screenshot of model.summary() output]
When I use the following code to train the model:
model.fit([author_repre_input, tweet_input], [Train_Y], epochs=20, batch_size=32,
          validation_split=0.2, shuffle=False, verbose=2)
Training keeps getting stuck in the first epoch, and the log does not show anything useful, just:
Epoch 1/20
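With verbose=2, Keras only prints a line once an epoch finishes, so to check whether the process is frozen or just extremely slow, I could fit on a tiny slice with the per-batch progress bar (a quick sanity-check sketch; the slice size of 64 is arbitrary):

# Sanity check: train on a tiny slice with verbose=1 so the per-batch
# progress bar is visible. If even a single batch takes minutes, training
# is just very slow rather than deadlocked. (Slice size is arbitrary.)
model.fit([author_repre_input[:64], tweet_input[:64]], Train_Y[:64],
          epochs=1, batch_size=8, verbose=1)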
I am wondering why this happens. The versions of TensorFlow and Keras I am using are:
tensorflow - 1.14.0
keras - 2.2.0
Thank you very much for your time!
Update on Jan 20...
I tried using Google Colab to train the model and checked the RAM while it was running. Colab allocated 25 GB of RAM for me; however, after several seconds of training, the session crashed because it had occupied all the available RAM...
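A rough back-of-the-envelope estimate of the LSTM's activation memory (my own arithmetic, assuming float32 and that backprop through time keeps every timestep's activations; the exact bookkeeping depends on the TF implementation):

# Rough estimate of activation memory for one batch through the LSTM.
# Backprop through time keeps activations for every one of the
# 260,000 timesteps produced by the Reshape layer.
timesteps = 260000
units = 100
batch_size = 32
bytes_per_float = 4   # float32
gates = 4             # LSTM input / forget / cell / output gates

hidden_states = timesteps * units * batch_size * bytes_per_float
gate_activations = gates * hidden_states
print(hidden_states / 1e9)      # ~3.3 GB for the hidden states alone
print(gate_activations / 1e9)   # ~13.3 GB more for the gate activations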
I think there must be something wrong with the model part... Any suggestions and insights would be appreciated!