Use TPU in Google Colab

Question

I am currently training a neural network on a TPU. I changed the runtime type and initialized the TPU, but training does not seem to be any faster. I followed https://www.tensorflow.org/guide/tpu. Did I do something wrong?

import os
import tensorflow as tf

# TPU initialization
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
tf.config.experimental_connect_to_cluster(resolver)
# This is the TPU initialization code that has to be at the beginning.
tf.tpu.experimental.initialize_tpu_system(resolver)
print("All devices: ", tf.config.list_logical_devices('TPU'))

.
.
.
# experimental_steps_per_execution = 50
model.compile(optimizer=Adam(lr=learning_rate), loss='binary_crossentropy', metrics=['accuracy'], experimental_steps_per_execution = 50)

The summary of my model

Is there anything I still have to consider or adjust?

Andrey · Accepted Answer · 2020-11-05T08:07:01.527

You need to create a TPU strategy:

strategy = tf.distribute.TPUStrategy(resolver)

Then create and compile the model inside the strategy's scope:

with strategy.scope():
  model = create_model()
  model.compile(optimizer='adam',
                loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                metrics=['sparse_categorical_accuracy'])
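
To tie the question's initialization code to the strategy above, here is a minimal sketch of the whole flow. The helper names `colab_tpu_address`, `make_strategy`, and `build_compiled_model` are hypothetical, as is the fallback to the default strategy when no TPU runtime is attached. The loss and metric are taken from the question's compile call, and `create_model` stands for whatever function builds your model.

```python
import os


def colab_tpu_address(env=os.environ):
    """Return the gRPC address of the Colab TPU, or None if no TPU runtime is attached."""
    addr = env.get("COLAB_TPU_ADDR")
    return "grpc://" + addr if addr else None


def make_strategy():
    """Connect to the TPU and return a TPUStrategy, or the default strategy without a TPU."""
    import tensorflow as tf  # imported lazily so the helper above works without TensorFlow

    address = colab_tpu_address()
    if address is None:
        # Assumption for this sketch: fall back to the default (CPU/GPU) strategy.
        return tf.distribute.get_strategy()

    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu=address)
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    return tf.distribute.TPUStrategy(resolver)


def build_compiled_model(strategy, create_model):
    """Create and compile the model inside strategy.scope() so its variables are placed by the strategy."""
    with strategy.scope():
        model = create_model()
        model.compile(optimizer="adam",
                      loss="binary_crossentropy",
                      metrics=["accuracy"])
    return model
```

Usage would look like `model = build_compiled_model(make_strategy(), create_model)`, followed by the usual `model.fit(...)`. The point is that both model creation and `compile` happen inside the scope; initializing the TPU alone does not move the model onto it.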

edited Nov 05 '20 at 08:07

answered Nov 05 '20 at 08:03


Thank you very much for the answer! How do I create the TPU strategy? Do you have a code snippet? – Nov 05 '20 at 08:04
How do you handle this error: `ResourceExhaustedError: 9 root error(s) found. (0) Resource exhausted: {{function_node __inference_train_function_14917}} Compilation failure: Ran out of memory in memory space hbm. Used 8.29G of 7.48G hbm. Exceeded hbm capacity by 825.64M.`? – Nov 05 '20 at 08:51
Your model is huge. Try decreasing batch_size to 8. – Andrey Nov 05 '20 at 08:55
Sorry for the trouble. I tried batch_size = 8. Unfortunately, the error keeps recurring. – Nov 05 '20 at 18:13
Try batch_size = 1. – Andrey Nov 05 '20 at 18:43
If that still fails, you have no choice other than simplifying your model :( – Andrey Nov 05 '20 at 18:58
