I am trying to train a seq2seq
model for language translation, and I am copy-pasting code from this Kaggle Notebook on Google Colab. The code is working fine with CPU and GPU, but it is giving me errors while training on a TPU. This same question has been already asked here.
Here is my code:
strategy = tf.distribute.experimental.TPUStrategy(resolver)
with strategy.scope():
model = create_model()
model.compile(optimizer = 'rmsprop', loss = 'categorical_crossentropy')
model.fit_generator(generator = generate_batch(X_train, y_train, batch_size = batch_size),
steps_per_epoch = train_samples // batch_size,
epochs = epochs,
validation_data = generate_batch(X_test, y_test, batch_size = batch_size),
validation_steps = val_samples // batch_size)
Traceback:
Epoch 1/2
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-60-940fe0ee3c8b> in <module>()
3 epochs = epochs,
4 validation_data = generate_batch(X_test, y_test, batch_size = batch_size),
----> 5 validation_steps = val_samples // batch_size)
10 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
992 except Exception as e: # pylint:disable=broad-except
993 if hasattr(e, "ag_error_metadata"):
--> 994 raise e.ag_error_metadata.to_exception(e)
995 else:
996 raise
ValueError: in user code:
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:853 train_function *
return step_function(self, iterator)
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:842 step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
...
ValueError: None values not supported.
I couldn't figure out the error, and I think the error is because of this generate_batch
function:
X, y = lines['english_sentence'], lines['hindi_sentence']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 34)
def generate_batch(X = X_train, y = y_train, batch_size = 128):
while True:
for j in range(0, len(X), batch_size):
encoder_input_data = np.zeros((batch_size, max_length_src), dtype='float32')
decoder_input_data = np.zeros((batch_size, max_length_tar), dtype='float32')
decoder_target_data = np.zeros((batch_size, max_length_tar, num_decoder_tokens), dtype='float32')
for i, (input_text, target_text) in enumerate(zip(X[j:j + batch_size], y[j:j + batch_size])):
for t, word in enumerate(input_text.split()):
encoder_input_data[i, t] = input_token_index[word]
for t, word in enumerate(target_text.split()):
if t<len(target_text.split())-1:
decoder_input_data[i, t] = target_token_index[word]
if t>0:
decoder_target_data[i, t - 1, target_token_index[word]] = 1.
yield([encoder_input_data, decoder_input_data], decoder_target_data)
My Colab notebook - here
Kaggle dataset - here
TensorFlow version - 2.6
Edit - Please don't tell me to down-grade TensorFlow/Keras version to 1.x
. I can down-grade it to TensorFlow 2.0, 2.1, 2.3
but not 1.x
. I don't understand TensorFlow 1.x
. Also, there is no point in using a 3-year-old version.