I am having problems integrating a BERT embedding layer into a BiLSTM model for a text classification task.
Each row of my dataset has two columns, text and polarity:
text = a string (a tweet)
polarity = 0 or 1
So the shape of the training data is (1500, 2).
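To make the layout concrete, here is a minimal sketch in plain Python. The three rows are made-up placeholders, not my real tweets, and the variable names are only for illustration. It shows how a (rows, 2) table splits into a list of texts and a label column of shape (rows, 1), which is the shape the labels have when they reach model.fit.

```python
# Hypothetical rows standing in for the real dataset (which has 1500 rows).
rows = [
    ("what a great day", 1),
    ("this is awful", 0),
    ("loving the new update", 1),
]

# Column 1: the raw text that is later tokenized for BERT.
texts = [text for text, _polarity in rows]
# Column 2: one label per tweet, kept as a column so the shape is (n_rows, 1).
labels = [[polarity] for _text, polarity in rows]

data_shape = (len(rows), len(rows[0]))
labels_shape = (len(labels), len(labels[0]))
print(data_shape)    # (3, 2)
print(labels_shape)  # (3, 1)
```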
I am generating BERT embeddings following this code: https://github.com/strongio/keras-bert/blob/master/keras-bert.ipynb
I want to add a Bi-LSTM between the BERT layer and the Dense layer. I have done it like this:
# Build model
def build_model(max_seq_length):
    embedding_size = 768
    in_id = tf.keras.layers.Input(shape=(max_seq_length,), name="input_ids")
    in_mask = tf.keras.layers.Input(shape=(max_seq_length,), name="input_masks")
    in_segment = tf.keras.layers.Input(shape=(max_seq_length,), name="segment_ids")
    bert_inputs = [in_id, in_mask, in_segment]
    bert_output = BertLayer(n_fine_tune_layers=3, pooling="mean")(bert_inputs)
    bert_output = tf.keras.layers.Reshape((max_seq_length, embedding_size))(bert_output)
    bilstm = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(128, dropout=0.2, recurrent_dropout=0.2, return_sequences=True)
    )(bert_output)
    output = tf.keras.layers.Dense(1, activation="softmax")(bilstm)
    model = tf.keras.models.Model(inputs=bert_inputs, outputs=output)
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.summary()
    return model

def initialize_vars(sess):
    sess.run(tf.local_variables_initializer())
    sess.run(tf.global_variables_initializer())
    sess.run(tf.tables_initializer())
    K.set_session(sess)

model = build_model(max_seq_length)

# Instantiate variables
initialize_vars(sess)

model.fit(
    [train_input_ids, train_input_masks, train_segment_ids],
    train_labels,
    validation_data=([test_input_ids, test_input_masks, test_segment_ids], test_labels),
    epochs=1,
    batch_size=32
)
It gives an error: ValueError: A target array with shape (1500, 1) was passed for an output of shape (None, 256, 1) while using as loss `binary_crossentropy`. This loss expects targets to have the same shape as the output.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_core/python/ops/init_ops.py:97: calling GlorotUniform.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_core/python/ops/init_ops.py:97: calling Orthogonal.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_core/python/ops/init_ops.py:97: calling Zeros.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_ids (InputLayer)          [(None, 256)]        0
__________________________________________________________________________________________________
input_masks (InputLayer)        [(None, 256)]        0
__________________________________________________________________________________________________
segment_ids (InputLayer)        [(None, 256)]        0
__________________________________________________________________________________________________
bert_layer (BertLayer)          (None, 768)          110104890   input_ids[0][0]
                                                                 input_masks[0][0]
                                                                 segment_ids[0][0]
__________________________________________________________________________________________________
reshape (Reshape)               (None, 256, 768)     0           bert_layer[0][0]
__________________________________________________________________________________________________
bidirectional (Bidirectional)   (None, 256, 256)     918528      reshape[0][0]
__________________________________________________________________________________________________
dense (Dense)                   (None, 256, 1)       257         bidirectional[0][0]
==================================================================================================
Total params: 111,023,675
Trainable params: 22,182,401
Non-trainable params: 88,841,274
__________________________________________________________________________________________________
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-28-827856e3678d> in <module>()
      9     validation_data=([test_input_ids, test_input_masks, test_segment_ids], test_labels),
     10     epochs=1,
---> 11     batch_size=32
     12 )
/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/keras/engine/training_utils.py in check_loss_and_target_compatibility(targets, loss_fns, output_shapes)
    739           raise ValueError('A target array with shape ' + str(y.shape) +
    740                            ' was passed for an output of shape ' + str(shape) +
--> 741                            ' while using as loss `' + loss_name + '`. '
    742                            'This loss expects targets to have the same shape '
    743                            'as the output.')
ValueError: A target array with shape (1500, 1) was passed for an output of shape (None, 256, 1) while using as loss `binary_crossentropy`. This loss expects targets to have the same shape as the output.
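To see where the extra 256 axis might come from, here is a minimal sketch in plain Python that traces the shapes through the layers. The helper functions are hypothetical (they are not Keras APIs); they only mimic the shape rules of Reshape, of Bidirectional(LSTM) with the default concatenation of both directions, and of Dense. The sketch assumes the shapes reported in the summary above.

```python
# Hypothetical shape-tracing helpers; they imitate Keras shape rules
# but are not part of Keras itself.

def reshape_is_valid(per_sample_shape, target_shape):
    """A reshape can only succeed if both shapes hold the same number of elements."""
    def count(shape):
        total = 1
        for dim in shape:
            total *= dim
        return total
    return count(per_sample_shape) == count(target_shape)


def bilstm_output_shape(input_shape, units, return_sequences):
    """Shape produced by Bidirectional(LSTM(units)) when both directions are concatenated."""
    batch, timesteps, _features = input_shape
    if return_sequences:
        # One vector of size 2 * units per time step.
        return (batch, timesteps, 2 * units)
    # Only the final output of each direction, concatenated.
    return (batch, 2 * units)


def dense_output_shape(input_shape, units):
    """Dense acts on the last axis only and leaves the other axes unchanged."""
    return input_shape[:-1] + (units,)


max_seq_length, embedding_size = 256, 768
lstm_input = (None, max_seq_length, embedding_size)

# As in build_model: return_sequences=True gives one prediction per token.
per_token = dense_output_shape(bilstm_output_shape(lstm_input, 128, True), 1)
# Alternative: return_sequences=False gives one prediction per tweet.
per_tweet = dense_output_shape(bilstm_output_shape(lstm_input, 128, False), 1)

print(per_token)  # (None, 256, 1)
print(per_tweet)  # (None, 1)

# The summary lists the BertLayer output as (None, 768): one pooled vector per tweet.
print(reshape_is_valid((embedding_size,), (max_seq_length, embedding_size)))  # False
```

If this reading is right, return_sequences=False (or pooling over the time axis before the Dense layer) would give an output of shape (None, 1), which matches the (1500, 1) labels. The last line also suggests a second problem: with pooling="mean" the BERT layer yields 768 values per sample, while Reshape((256, 768)) needs 196,608, so I assume the layer would have to return per-token outputs for the BiLSTM to receive a real sequence.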
What can I do to resolve this? Does it have something to do with the activation or loss being used? How can the shapes be matched?
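On the activation part of the question, a small check in plain Python (no Keras involved; the two functions are written out by hand for illustration) shows that softmax over a single unit always returns 1.0, whatever the input. So Dense(1, activation="softmax") cannot produce anything but 1, and sigmoid is probably the activation that pairs with binary_crossentropy for a single output unit.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


for logit in (-3.0, 0.0, 3.0):
    # With one unit, softmax normalizes a single value by itself: always [1.0].
    print(logit, softmax([logit]), sigmoid(logit))
```

This is separate from the shape error itself: changing the activation alone would not remove the extra 256 axis.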
Any help will be appreciated.