
I am using Keras to build a CNN-LSTM model for tweet classification. The model has two inputs and the task is three-class classification. The code I use to construct the model is given below:

from keras.layers import Input, Conv2D, Flatten, Reshape, LSTM, Dense, concatenate
from keras.models import Model
from keras.optimizers import Adam
from keras import regularizers

def conv2d_lstm_with_author():

    # Get the input information - author & tweet
    author_repre_input = Input(shape=(100,), name='author_input')
    tweet_input = Input(shape=(13, 100, 1), name='tweet_input')

    # Create the convolutional layer and lstm layer
    conv2d = Conv2D(filters=200, kernel_size=(2, 100), padding='same', activation='relu',
                    use_bias=True, name='conv_1')(tweet_input)
    flat = Flatten(name='flatten_1')(conv2d)
    reshape_flat = Reshape((260000, 1), name='reshape_1')(flat)
    lstm = LSTM(100, return_state=False, activation='tanh',
                recurrent_activation='hard_sigmoid', name='lstm_1')(reshape_flat)
    concatenate_layer = concatenate([lstm, author_repre_input], axis=1, name='concat_1')
    dense_1 = Dense(10, activation='relu', name='dense_1')(concatenate_layer)
    output = Dense(3, activation='softmax', kernel_regularizer=regularizers.l2(0.01), name='output_dense')(dense_1)

    # Build the model
    model = Model(inputs=[author_repre_input, tweet_input], outputs=output)
    return model

model = conv2d_lstm_with_author()
model.summary()

optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

The shape of my two inputs and label are:

author_repre_input: (40942, 100)
tweet_input: (40942, 13, 100, 1)
my label Train_Y: (40942, 3)

A snapshot of the model summary is:

[screenshot of the model summary]

When I use the following code to train the data:

model.fit([author_repre_input, tweet_input], [Train_Y], epochs=20, batch_size=32, validation_split=0.2, 
          shuffle=False, verbose=2)

Training gets stuck at the first epoch, and the log shows nothing useful beyond:

Epoch 1/20

I am wondering why this happens. The versions of TensorFlow and Keras I am using are:

tensorflow - 1.14.0
keras - 2.2.0

Thank you very much for your time!


Update on Jan 20...

I tried using Google Colab to train the model and checked the RAM while the model was running. Colab allocated 25 GB of RAM for me. However, after a few seconds of training, the session crashed because all available RAM was used up...

[screenshot of the Colab session crash]

I think there must be something wrong with the model part... Any suggestions and insights would be appreciated!

Bright Chang

2 Answers


Fortunately for you, you are not stuck.

The issue comes from the fact that in your model.fit, you specified the parameter verbose=2.

With verbose=2, Keras prints only one summary line at the end of each epoch, so you see no progress output during the epoch itself.

To solve your problem and see training progress, set verbose=1.

Timbus Calin
  • Wow, thank you for your prompt reply! But when I change the setting to `verbose=1`, all I can see is: `Train on 32753 samples, validate on 8189 samples Epoch 1/20` – Bright Chang Jan 19 '20 at 09:15
  • 1
    Okay, it still means that your problem is solved. You need to wait a bit for the training to start. – Timbus Calin Jan 19 '20 at 09:15
  • Maybe... I don't have any GPU... But thank you for your answer – Bright Chang Jan 19 '20 at 09:17
  • I don't know if this helps or not, but even on Google Colab you can increase the RAM to 32 GB whenever the session crashes, so expanding the RAM might help. And please verify whether you are using a GPU or a TPU, as a GPU will be faster than a TPU in Google Colab – Aditya Bhattacharya Feb 05 '20 at 03:50

I think I have found the answer...

The problem is in the convolutional layer. The kernel size is too small, which makes the dimensionality of the flattened output far too high. To solve this, I changed the kernel size from (2, 100) to (3, 100). Furthermore, I also added dropout to the model. The summary of the model I use now is given below:
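To see why the original layer is so expensive, the size of the sequence fed into the LSTM can be worked out by hand. This is a sketch using TensorFlow's standard Conv2D output-shape rules; note that the switch to padding='valid' in the second calculation is my assumption, since with padding='same' the kernel change alone would not shrink the output:

```python
import math

# Conv2D output height/width (TensorFlow shape rules):
#   'same'  -> ceil(input / stride)
#   'valid' -> floor((input - kernel) / stride) + 1
def conv2d_out(h, w, kh, kw, padding, stride=1):
    if padding == 'same':
        return math.ceil(h / stride), math.ceil(w / stride)
    return (h - kh) // stride + 1, (w - kw) // stride + 1

filters = 200

# Original layer: kernel (2, 100), padding='same', input (13, 100)
h, w = conv2d_out(13, 100, 2, 100, 'same')
orig_timesteps = h * w * filters  # flattened, then reshaped to (n, 1)
print(orig_timesteps)             # 260000 timesteps fed to the LSTM

# Revised layer: kernel (3, 100) with padding='valid' (my assumption)
h, w = conv2d_out(13, 100, 3, 100, 'valid')
new_timesteps = h * w * filters
print(new_timesteps)              # 2200 timesteps
```

An LSTM unrolled over 260,000 timesteps is extremely slow and memory-hungry, which matches both the apparent hang on CPU and the Colab RAM crash.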

[screenshot of the revised model summary]

Now the model runs smoothly in Google Colab.

Hence, if a similar problem occurs, check the output dimension of each layer. Keras may appear to hang in the training epochs if the model produces a very high-dimensional output.
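As a quick way to apply this check, here is a small hypothetical helper (the function name and the size limit are my own) that flags layers whose flattened output is suspiciously large, using the per-layer shapes from the original model in the question:

```python
# Hypothetical helper: flag layers whose flattened output size exceeds a limit.
def flag_huge_layers(shapes, limit=100_000):
    """shapes: dict of layer name -> output shape (without the batch dim)."""
    flagged = {}
    for name, shape in shapes.items():
        size = 1
        for d in shape:
            size *= d
        if size > limit:
            flagged[name] = size
    return flagged

# Per-layer output shapes from the original model in the question
shapes = {
    'conv_1': (13, 100, 200),
    'flatten_1': (260000,),
    'reshape_1': (260000, 1),
    'lstm_1': (100,),
}
print(flag_huge_layers(shapes))
# flags conv_1, flatten_1 and reshape_1, each with 260,000 elements
```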

Bright Chang