I'm training a model with TensorFlow Keras and NumPy input with:
epochs = 10
batch_size = 128
model.fit(
    x=[train_asset_text_seq, train_bug_text_seq],
    y=y_train.values.reshape(-1, 1),
    epochs=epochs,
    batch_size=batch_size,
    validation_data=([val_asset_text_seq, val_bug_text_seq], y_val.values.reshape(-1, 1)),
)
To speed up model building and evaluation, I wanted to make use of the tf.data input format, so I changed it to:
X_train_ds = tf.data.Dataset.from_tensor_slices((train_text_1, train_text_2))
y_train_ds = tf.data.Dataset.from_tensor_slices(y_train.values.reshape(-1, 1))
X_val_ds = tf.data.Dataset.from_tensor_slices((val_text_1, val_text_2))
y_val_ds = tf.data.Dataset.from_tensor_slices(y_val.values.reshape(-1, 1))
model.fit(
    tf.data.Dataset.zip((X_train_ds, y_train_ds)).batch(batch_size).repeat(),
    validation_data=tf.data.Dataset.zip((X_val_ds, y_val_ds)),
    epochs=epochs,
    steps_per_epoch=30,
)
which seems to work for training but throws an error for validation with:
Input 0 of layer "lstm" is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (124, 124)

Call arguments received by layer "model" (type Functional):
  • inputs=('tf.Tensor(shape=(124,), dtype=int32)', 'tf.Tensor(shape=(124,), dtype=int32)')
  • training=False
  • mask=None
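Comparing what the two pipelines actually yield seems to confirm a shape mismatch; printing element_spec (a quick check, assuming the datasets are built exactly as above) gives roughly:

train_ds = tf.data.Dataset.zip((X_train_ds, y_train_ds)).batch(batch_size).repeat()
val_ds = tf.data.Dataset.zip((X_val_ds, y_val_ds))

# Batched training pipeline: every element has a leading batch dimension
print(train_ds.element_spec)
# ((TensorSpec(shape=(None, 124), ...), TensorSpec(shape=(None, 124), ...)),
#  TensorSpec(shape=(None, 1), ...))

# Unbatched validation pipeline: every element is a single (124,) example
print(val_ds.element_spec)
# ((TensorSpec(shape=(124,), ...), TensorSpec(shape=(124,), ...)),
#  TensorSpec(shape=(1,), ...))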
As you can see, I'm using an LSTM layer in the model. I also tried changing the fit call to batch and repeat the validation data as well, but that throws the same error as above:
model.fit(
    tf.data.Dataset.zip((X_train_ds, y_train_ds)).batch(batch_size).repeat(),
    validation_data=tf.data.Dataset.zip((X_val_ds, y_val_ds)).batch(batch_size).repeat(),
    epochs=epochs,
    steps_per_epoch=30,
    validation_steps=30,
)
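My understanding is that model.fit expects each dataset element to be an (inputs, targets) pair, with inputs itself being a tuple matching the model's two Input layers, which the zip above should already produce. The same structure can also be built in a single call (a sketch, assuming the same arrays as before):

train_ds = tf.data.Dataset.from_tensor_slices(
    ((train_text_1, train_text_2), y_train.values.reshape(-1, 1))
).batch(batch_size).repeat()

val_ds = tf.data.Dataset.from_tensor_slices(
    ((val_text_1, val_text_2), y_val.values.reshape(-1, 1))
).batch(batch_size)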
Do I need to adjust the model when I use a tf.data.Dataset instead of NumPy input, and why does it work for training but fail for validation?
Update:
I'm building a siamese network for text classification. The model is currently defined as:
from tensorflow.keras.layers import (
    Input, Embedding, LSTM, Dropout, Flatten, Lambda, Concatenate, Dense
)
from tensorflow.keras.models import Model

input_1 = Input(shape=(train_asset_text_seq.shape[1],))
input_2 = Input(shape=(train_bug_text_seq.shape[1],))

# Shared embedding layer for both branches of the siamese network
common_embed = Embedding(
    name="synopsis_embedd",
    input_dim=len(t.word_index) + 1,
    output_dim=EMBEDDING_DIM,
    input_length=train_asset_text_seq.shape[1],
    mask_zero=True,
)
lstm_1 = common_embed(input_1)
lstm_2 = common_embed(input_2)

# Shared LSTM layer, applied to both embedded sequences
common_lstm = LSTM(32, return_sequences=True, activation="relu")
vector_1 = common_lstm(lstm_1)
vector_1 = Dropout(0.5)(vector_1)
vector_1 = Flatten()(vector_1)

vector_2 = common_lstm(lstm_2)
vector_2 = Dropout(0.5)(vector_2)
vector_2 = Flatten()(vector_2)

# Cosine distance between the branches, concatenated with both vectors
x5 = Lambda(cosine_distance, output_shape=cos_dist_output_shape)([vector_1, vector_2])
conc = Concatenate(axis=-1)([x5, vector_1, vector_2])

x = Dense(100, activation="relu", name="conc_layer")(conc)
x = Dropout(0.1)(x)
out = Dense(1, activation="sigmoid", name="out")(x)

model = Model([input_1, input_2], out)
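For reference, cosine_distance and cos_dist_output_shape are the usual Keras-backend helpers, roughly:

import tensorflow.keras.backend as K

def cosine_distance(vects):
    # 1 - cosine similarity of the two L2-normalised vectors
    x, y = vects
    x = K.l2_normalize(x, axis=-1)
    y = K.l2_normalize(y, axis=-1)
    return 1 - K.sum(x * y, axis=-1, keepdims=True)

def cos_dist_output_shape(shapes):
    # One distance value per pair in the batch
    shape1, shape2 = shapes
    return (shape1[0], 1)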