I am using Keras to build a CNN-LSTM model for tweet classification. The model has two inputs, and the task is three-class classification. The code I use to build the model is given below:
from keras.layers import Input, Conv2D, Flatten, Reshape, LSTM, Dense, concatenate
from keras.models import Model
from keras.optimizers import Adam
from keras import regularizers

def conv2d_lstm_with_author():
    # Get the input information - author & tweet
    author_repre_input = Input(shape=(100,), name='author_input')
    tweet_input = Input(shape=(13, 100, 1), name='tweet_input')
    # Create the convolutional layer and lstm layer
    conv2d = Conv2D(filters=200, kernel_size=(2, 100), padding='same', activation='relu',
                    use_bias=True, name='conv_1')(tweet_input)
    flat = Flatten(name='flatten_1')(conv2d)
    reshape_flat = Reshape((260000, 1), name='reshape_1')(flat)
    lstm = LSTM(100, return_state=False, activation='tanh',
                recurrent_activation='hard_sigmoid', name='lstm_1')(reshape_flat)
    concatenate_layer = concatenate([lstm, author_repre_input], axis=1, name='concat_1')
    dense_1 = Dense(10, activation='relu', name='dense_1')(concatenate_layer)
    output = Dense(3, activation='softmax', kernel_regularizer=regularizers.l2(0.01),
                   name='output_dense')(dense_1)
    # Build the model
    model = Model(inputs=[author_repre_input, tweet_input], outputs=output)
    return model

model = conv2d_lstm_with_author()
model.summary()

optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
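For reference, here is how I understand the tensor shapes flowing through the model (my own shape trace; the loop just prints what Keras reports for each layer):

# Print the output shape of every layer (batch dimension shows up as None).
for layer in model.layers:
    print(layer.name, layer.output_shape)

# By my arithmetic, Conv2D with 200 filters and padding='same' keeps the
# (13, 100) spatial size, so Flatten yields 13 * 100 * 200 = 260000 values,
# which the Reshape then hands to the LSTM as 260,000 timesteps of size 1.
print(13 * 100 * 200)  # 260000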
The shapes of my two inputs and labels are:
author_repre_input: (40942, 100)
tweet_input: (40942, 13, 100, 1)
my label Train_Y: (40942, 3)
A snapshot of the model summary: [screenshot of model.summary() output]
When I use the following code to train the model:
model.fit([author_repre_input, tweet_input], [Train_Y], epochs=20, batch_size=32,
          validation_split=0.2, shuffle=False, verbose=2)
Training keeps getting stuck in the first epoch, and the log does not show anything useful, just:
Epoch 1/20
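With verbose=2, Keras only prints a line once an epoch finishes, so to check whether the process is frozen or just extremely slow, I could fit on a tiny slice with the per-batch progress bar (a quick sanity-check sketch; the slice size of 64 is arbitrary):

# Sanity check: train on a tiny slice with verbose=1 so the per-batch
# progress bar is visible. If even a single batch takes minutes, training
# is just very slow rather than deadlocked. (Slice size is arbitrary.)
model.fit([author_repre_input[:64], tweet_input[:64]], Train_Y[:64],
          epochs=1, batch_size=8, verbose=1)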
I am wondering why this happens. The versions of TensorFlow and Keras I am using are:
tensorflow - 1.14.0
keras - 2.2.0
Thank you very much for your time!
Update on Jan 20...
I tried using Google Colab to train the model and checked the RAM while it was running. Colab allocated 25 GB of RAM for me; however, after several seconds of training, the session crashed because it had occupied all the available RAM...
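A rough back-of-the-envelope estimate of the LSTM's activation memory (my own arithmetic, assuming float32 and that backprop through time keeps every timestep's activations; the exact bookkeeping depends on the TF implementation):

# Rough estimate of activation memory for one batch through the LSTM.
# Backprop through time keeps activations for every one of the
# 260,000 timesteps produced by the Reshape layer.
timesteps = 260000
units = 100
batch_size = 32
bytes_per_float = 4   # float32
gates = 4             # LSTM input / forget / cell / output gates

hidden_states = timesteps * units * batch_size * bytes_per_float
gate_activations = gates * hidden_states
print(hidden_states / 1e9)      # ~3.3 GB for the hidden states alone
print(gate_activations / 1e9)   # ~13.3 GB more for the gate activations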
I think there must be something wrong with the model part... Any suggestions and insights would be appreciated!